Skip to content

alexzhaosheng/huko

Repository files navigation

huko

A project-scoped AI agent for the command line — end-to-end task planning, tool use, and code-reading built in.

Built on huko-engine — the embeddable agent runtime that powers this CLI (and any other host).

huko slashes token waste through configurable context compaction and a lean tool/skill surface — no summariser LLM calls, no bloated system prompts. Give it a goal and the agent drives the loop end to end: plans, reads code, runs shell, calls tools, decides when it's done. Multiple turns inside one invocation, automatically.

State (sessions, agent history, keys) lives in your project's .huko/ like .git; optional Docker sandbox; per-tool safety policy; multi-provider.

# Install
npm install -g @alexzhaosheng/huko

# Configure: a chat-based setup where you describe the project and
# huko picks the right defaults (compaction, safety, features…) for you.
# First-run? It falls back to a Q\&A wizard for provider + key + model.
huko setup

# Use
huko -- "the /api/users endpoint is returning 500 — read the handler, follow the imports, find the root cause"
huko -- "add Google OAuth to the login flow, follow the existing auth patterns in src/auth/"
cat logs/recent.log | huko -- "are these errors caused by my recent commits? check git history and tell me which commit"
huko docker run -- "audit dependencies for unmaintained packages and known CVEs"   # sandboxed

---

Why huko

  • Agent loop, not prompt+reply. Give huko a goal — the agent plans, reads code, runs shell, calls tools, and decides when it's done. "Fix this bug" instead of "tell me about this bug."
  • Token-efficient by design. Configurable compaction — five presets from 32K to max window — keeps conversation budgets predictable across models. Pure algorithmic compression: no summariser LLM call, no extra tokens spent.
  • Lean tool surface. ~13 tools in full mode, 1 in lean (~85% smaller overhead). Lightweight skill system — author markdown recipes, activate per-project or per-invocation. No sprawling plugin architecture.
  • Project as context. Sessions, keys, and config live in .huko/ like .git — cd in and huko has its memory, cd out and you're in a different world.
  • Provider-agnostic. Anthropic, OpenAI, DeepSeek, or your own OpenAI-compatible gateway. huko model current <id> to switch.
  • Sandboxable. huko docker run -- "..." runs the agent in a container with the project mounted read-only at /work.
  • Per-tool safety. Disable, deny, allow, or require-confirm per tool. Disabled tools vanish from the LLM's surface entirely.
  • Three-layer redaction. Built-in regex + custom patterns + vault (exact-string registration). Placeholders round-trip — the LLM uses \[REDACTED:foo] symbolically, the real value is expanded before execution.

---

Install

npm

npm install -g @alexzhaosheng/huko
huko --version   # confirm install — output includes commit + build date
huko --help

Requires Node.js 24+.

Docker

docker pull ghcr.io/alexzhaosheng/huko:latest
huko docker run -- "fix the bug in main.ts"

The huko docker run wrapper auto-mounts $PWD and \~/.huko, forwards your shell's API-key env vars, and hands the rest of the argv to the inner huko. Full convention in docs/docker.md.

---

Quick start

# 1. Tell huko what kind of project you're working on — it picks the right defaults.
huko setup

# 2. Run something.
huko -- "summarise what changed in this branch since main"

# 3. Talk back and forth.
huko --chat

That's the full happy path. Everything else is variations on the same shape.

---

Setup, your way (the chat-based configurator)

huko setup drops you into a short conversation with the agent. Tell it what kind of project this is"coding in this TS repo", "running maintenance scripts in CI", "long REPL for data exploration" — and it picks sensible defaults across the knobs that actually matter: compaction level, safety policy, recommended features. It then proposes them in plain language and applies after you confirm.

Three ways to begin (you'll see them as a banner the moment you run huko setup):

huko setup — agent mode (memory session, nothing persists)

Three ways to start. Say whichever sounds right:

  1. Describe what you'll use huko for
     e.g. "coding in this TS repo", "data analysis with Python",
          "running maintenance scripts in CI"
     → I'll propose sensible defaults (compaction, safety, features)
       and confirm before applying.

  2. State a specific goal
     e.g. "switch to OpenAI gpt-5", "block rm -rf for safety",
          "set up a morning git digest schedule"

  3. Ask about current state
     e.g. "what's configured?", "where's my API key resolved from?",
          "is the daemon running?"

Why path 1 (use-case description) is the recommended default: compaction level alone makes a big difference, and the right choice depends on what you'll actually do. The agent picks for you:

You said you'll use huko for… huko picks compaction.level Why
Coding / refactoring / multi-file changes large (256K) Files + plan + tool results all need to stay in scope
Long REPL / deep exploration max (95% of model window) Use everything the model offers
Maintenance / one-off scripts / CI standard (64K) Tasks complete quickly; bigger budget wastes prefill tokens
Quick triage / light queries concise (32K) Aggressive compaction is fine; turns are short

The agent also folds in use-case-driven recommendations beyond compaction: safety tripwires (safety deny bash 're:^rm -rf') for CI / unattended runs, huko vault add for credential redaction in scripts, huko skills for projects with coding conventions, and so on. Nothing is applied without your confirmation.

First-time setup: if no provider + model + key is configured yet, huko setup automatically falls back to the classic Q&A wizard (pick a provider → supply a key → pick a default model). The chat configurator kicks in afterwards on subsequent huko setup runs.

Skip the chat: huko setup --no-chat forces the classic wizard. Use this when saved config looks present but is actually broken (expired key, dead provider URL) — chat mode would boot up and immediately fail at the first LLM call.

The setup session is a memory session — nothing about the conversation persists on disk. Your configuration changes (provider, keys, safety rules, etc.) are persistent; the conversation that led to them is not.

---

Patterns

Pipe data in, get the answer out

cat logs/recent.log | huko -- "are these errors caused by my recent commits? check git and find culprits"
git diff            | huko -- "review for risky changes — read the affected files for context if needed"
ss -tulpn           | huko --json -- "list open ports as JSON" > ports.json
echo "say hi"       | huko                    # stdin alone is the prompt (lean-style usage)
huko < prompt.txt                             # file redirect works the same

When stdin is piped AND argv has a prompt, they combine: stdin is treated as input data, argv as the instruction. Mirrors how grep/jq/awk feel.

Sessions

huko sessions list                           # all chats in this project
huko sessions current                        # the active one
huko sessions switch 7                       # rejoin chat #7
huko --new -- "start a fresh thread"         # new session, becomes active
huko --new --title="auth-refactor" -- "..."  # name it on creation
huko --session=7 -- "follow-up question"     # one-shot send into session 7
huko --memory -- "one-off, leaves no trace"  # ephemeral, no on-disk state
huko --chat                                  # interactive REPL

Short flags for the high-frequency ones: -n = --new, -m = --memory, -c = --chat. (No POSIX bundling — write them separately: huko -n -m -- "...".)

--no-markdown (--no-md) skips terminal markdown rendering — useful when the LLM output contains literal \* or | that the renderer would misinterpret (shell globs, regex patterns).

Output formats

huko -- "..."           # default text (assistant answer to stdout, diag to stderr)
huko --json -- "..."    # one JSON document at task end
huko --jsonl -- "..."   # streaming events line-delimited
huko --format=json -- "..."   # same as --json (when you want to make the mode explicit)

Verbosity + diagnostics:

huko -v -- "..."         # --verbose: print every event the kernel emits
huko --quiet -- "..."    # the inverse — final answer only, drop progress diag
huko --show-tokens -- "..."   # append a token / cost summary line to stderr

Scripting / non-interactive

huko -y -- "..."                  # --no-interaction: agent CANNOT call message(type=ask);
                                  # ideal for cron, CI, and any pipe where no human
                                  # is at the keyboard
huko -y --json -- "..." > out.json

-y strips the ask enum from the message tool's schema entirely (the LLM literally can't request input). For tasks that genuinely need an answer, the agent must surface the question via message(type=result) instead.

Provider / model / key management

huko provider list
huko model list
huko model show deepseek/deepseek-v4-pro     # full record + effective context window + source
huko model update deepseek/deepseek-v4-pro --context-window=128000
                                             # pin the budget when your gateway's real limit
                                             # differs from the heuristic
huko keys list                               # shows source layer per ref
huko keys set deepseek                       # hidden prompt → writes <cwd>/.huko/keys.json (chmod 600)
huko model current anthropic/claude-sonnet-4-6
huko --lean -- "single-shot, minimal overhead"
huko --compact=extended -- "big refactor — keep more history before compacting"

Docker (sandboxed runs)

huko docker run -- "audit deps for CVEs"
cat config.yaml | huko docker run -- "is this safe for production?"
huko docker run --image myorg/huko-fork:dev -- "..."   # custom image

The container has filesystem isolation (only $PWD + \~/.huko are mounted) but full network egress by default. See docs/docker.md for the precise security model.

Safety (tool-level controls)

huko safety tool                              # list every tool + status + rule counts
huko safety disable web\_fetch                 # remove from LLM surface entirely
huko safety enable web\_fetch                  # put it back
huko safety deny bash 're:^rm -rf'            # block matching calls
huko safety allow bash '^ls '                 # auto-approve matching calls
huko safety require write\_file 're:/etc/'    # prompt operator before matching calls
huko safety unset bash 'rm -rf'               # remove a single pattern
huko safety unset bash                        # wipe the whole entry
huko safety list                              # full pattern dump per tool
huko safety check bash command='rm -rf /'     # dry-run a hypothetical call

Editing verbs default to project (<cwd>/.huko/config.json); pass --global for \~/.huko/config.json. disabled is stronger than deny — the LLM never sees the tool's name or schema, so it can't try to call it.

Vault (per-string redaction)

huko vault add github-token                   # hidden prompt for the value
huko vault add prod-db-pw --value 'p@ssw0rd!' # direct (scripting only — leaks to history)
huko vault list                               # names + lengths (NEVER values)
huko vault remove old-token                   # unregister
echo "my password is p@ssw0rd!" | huko vault test   # debug: see what gets redacted

Three-layer redaction every outbound message goes through:

  1. Built-in regex (always on) — known shapes like sk-..., ghp\_..., AKIA..., PEM private keys, JWTs.
  2. safety.redactPatterns (project / global config) — your own regex for environment-specific secret shapes.
  3. Vault (\~/.huko/vault.json, chmod 600) — exact strings you registered. Round-trips: when the LLM emits a tool call referencing a placeholder, huko expands it back to the real value before the tool runs — the LLM never sees raw, but can still USE the secret symbolically.

Storage is global only; project-specific redactions belong in regex (Layer 2). For real isolation use huko docker run to sandbox the whole agent.

---

Compaction levels — pick how much memory the agent gets

Every long agent run eventually fills the context window. huko's compactor watches the conversation token count and, when it crosses a configurable budget, summarises the older turns into a structured digest and drops the original bulk. This is pure algorithmic compression — no LLM call, no extra tokens spent on summarisation (see docs/architecture.md for the digest format).

The trigger budget is exposed as five presets plus a literal max, all targeting absolute token counts so a level means the same conversation length regardless of which model is running:

Level Target tokens Behaviour
concise 32k Aggressive — quick tasks, small context
standard (default) 64k Sensible middle ground
extended 128k Bigger tasks, more recall
large 256k Long sessions; clamps to 95% on smaller-window models
max 95% of model window Use everything the model offers (≈ 190K on Claude Sonnet, ≈ 950K on Opus-1M / GPT-5.5)

Coding work? Use --compact=large by default. Reading source, following imports, applying multi-file edits all chew through context fast — large (256K target) gives the agent enough room to keep file content, plan state, and recent tool results all in scope at once. Drop to standard for chat-style questions; reach for max only when you're explicitly working on something that needs the full window (large refactors, long-running REPLs).

# One-shot — coding tasks default to large
huko --compact=large -- "refactor server/auth/ following the existing module patterns"
huko --chat --compact=max                                   # long REPL: use everything

# Persistent
huko config set compaction.level large --project           # this project (good default for code repos)
huko config set compaction.level standard                  # globally

For the rare case where a preset doesn't fit, the raw ratio escape hatch still works:

huko --compact-threshold=0.4 -- ...      # custom: triggers at 40% of window

Setting --compact-threshold (or config.compaction.thresholdRatio) overrides any preset and the effective display flips to custom. Inspect the live setting with huko info (under Compaction:) — it shows the active level, the resolved threshold/target percentages, and the absolute token budget on the current model.

---

Browser Control

An opt-in feature that lets the agent operate your real Chrome browser through a lightweight extension. All cookies, logins, and sessions are live — the agent sees and interacts with exactly what you see.

Quick start

# 1. Load the extension (one-time setup)
huko extension         # prints the install path + the two-click steps
#    → open chrome://extensions, toggle "Developer mode",
#    → click "Load unpacked" and pick the path huko printed

# 2. Enable browser-control in chat mode
huko --chat --enable=browser-control

The extension icon shows connection status: red = disconnected, green = connected.

How it works

When browser-control is enabled in chat mode, huko starts a local WebSocket server (default port 19222). The Chrome extension connects to this server and executes commands in the user's real browsing environment. When chat mode exits, the server stops and the extension disconnects.

Configuration

Browser-control parameters live under tools.browser in huko's layered config. Inspect or change them with huko config:

Parameter Default Description
tools.browser.wsPort 19222 WebSocket port for the Chrome extension to connect to
tools.browser.defaultTimeoutMs 30000 Per-action timeout in milliseconds
tools.browser.maxScreenshotBytes 5242880 Maximum screenshot image size in bytes (5 MiB)
# Change the port for this project (e.g. port conflict)
huko config set tools.browser.wsPort 19224 --project

# Increase screenshot size limit
huko config set tools.browser.maxScreenshotBytes 10485760 --project

# Inspect current values
huko config show

Limitations

  • Chat mode only. One-shot runs (huko -- prompt) never start sidecars — browser commands will fail with a clear "server not running" error.
  • Single client. Only one Chrome extension can connect at a time.
  • Local only. The WebSocket server binds to 127.0.0.1 — no remote browser control.

Actions

The browser tool surfaces these actions to the LLM:

  • navigate — open a URL in a new tab, return visible page text
  • click — click the first element matching a CSS selector
  • type — type text into an input matching a CSS selector
  • scroll — scroll the active page (up / down / top / bottom)
  • get\_text — return visible text content of the active page
  • get\_html — return full HTML source of the active page
  • screenshot — capture a PNG screenshot
  • wait — wait for a selector to appear or a plain timeout
  • list\_pages — list all open tabs (URL + title)
  • switch\_page — switch the active tab by index

---

Skills

Skills are operator-authored markdown files that get spliced into the system prompt when activated — recipes, project conventions, deploy procedures, anything you'd otherwise re-type at the top of every chat. They are off by default: dropping a file in the skills directory has no effect until you explicitly turn it on.

Distinct from the planner's per-phase capability tagging (roles/, model-driven): skills are operator-driven and stay active for the whole session.

Authoring

Two layouts work, both recognised by file/folder name as the skill's identity:

\~/.huko/skills/<name>.md             # global, single file
\~/.huko/skills/<name>/SKILL.md       # global, folder-style (supporting assets fine)
<cwd>/.huko/skills/<name>.md         # project, same shapes — wins over global
<cwd>/.huko/skills/<name>/SKILL.md

Front matter holds a one-line description (shown in huko skills list and rendered above the body in the system prompt). Any other YAML keys are silently ignored so files written for other agent tools drop in unchanged.

---
description: Pre-deploy checklist + rollback for this service
---

When the user asks to deploy:
1. Run `pnpm test` and abort on any failure.
2. Tag the release as `v<semver>`.
3. ...

Activating

Two paths — both stack additively:

# Persistent (per project or globally)
huko config set skills.deploy.enabled true --project
huko config set skills.deploy.enabled true            # global

# One-shot (this invocation only; repeatable)
huko --skill=deploy --skill=git-workflow -- "ship 0.3.0"
huko --chat --skill=deploy

A typo in --skill=foo aborts at bootstrap with the list of searched paths. A skill listed as enabled: true in config but missing on disk warns and is skipped — config drift can't brick startup.

Discovery

huko skills list
# NAME    SOURCE   ACTIVE  DESCRIPTION
# deploy  project  yes     Pre-deploy checklist + rollback for this service
# triage  user     no      Bug-triage workflow with reproduction template

Project files shadow global files of the same name (matches the rest of the layered-config story).

When any skill is active, the chat banner ends with skills: deploy, triage, and one-shot runs print huko: skills active — deploy, triage to stderr before the LLM call. No surprise activations.

---

Web UI (daemon mode)

huko daemon start brings up a long-running HTTP + WebSocket server for the current project and prints a http://localhost:<port> URL. Open it in a browser, paste the token printed at startup, and you get the same agent loop the CLI drives — message stream, tool calls, ask/decision dialogs — in a React console.

# Start the daemon for this project
cd \~/my-project
huko daemon start
# huko daemon — running
#   Name:   my-project
#   URL:    http://127.0.0.1:3000
#   PID:    12345
#   Token:  d8e9f0...   (also at \~/.huko/daemon-token.json, chmod 600)

# Set a custom project name (persisted to <.huko/project-name>)
huko --name="My Project" daemon start
# huko daemon — running
#   Name:   My Project
#   URL:    http://127.0.0.1:3000
#   PID:    12346
#   Token:  d8e9f0...

# The name shows in `daemon list` and the web UI header
huko daemon list
# huko daemon — 2 running
#   my-project   \~/my-project   http://127.0.0.1:3000  pid 12345  uptime 12m
#   My Project   \~/other-proj   http://127.0.0.1:3001  pid 12346  uptime 1m

# Multi-project workflow
cd \~/other-project \&\& huko daemon start    # auto-allocates next free port (e.g. 3001)
huko daemon stop --all                      # batch stop

# Stop / restart preserves the URL
huko daemon stop \&\& huko daemon start
# always lands on the same port; the assignment lives at <cwd>/.huko/daemon-port
Subcommand What
huko daemon start Detached background process. Auto port on first start, fails loud if the persisted port is later taken by something else.
huko daemon stop \[--all] SIGTERM + 5s grace + SIGKILL. --all walks the registry.
huko daemon status URL, PID, uptime, token for this project.
huko daemon token Print the user-scoped token to stdout (exit 1 if none).
huko daemon list All running daemons on this machine, dead-pid filter on read.

Threat model: single Bearer token (256-bit, chmod 600, constant-time compare), localhost-only by default, no TLS. Same-user dev-machine scope. For remote access use an SSH tunnel (ssh -L 3000:localhost:3000 host); for LAN exposure set daemon.host = 0.0.0.0 and put Caddy / nginx in front for TLS. See docs/daemon.md for the full design.

---

Scheduled tasks

When the daemon is running, you can hand it a cron expression + a prompt and let it drive the agent on a schedule. Each schedule lives in one Markdown file under <cwd>/.huko/schedules/, gets its own long-lived session, and every fire becomes a new task in that session — the full history is browsable in the web UI's Schedules tab.

mkdir -p .huko/schedules
cat > .huko/schedules/morning-brief.md << 'EOF'
---
cron: "0 8 \* \* \*"
timezone: "Asia/Shanghai"
---

Each morning: run `git log --since="24 hours ago" develop`, group the
commits into feat / fix / chore, and deliver the markdown digest via
message(type=result).
EOF

huko daemon restart                  # picks up new / edited schedules

huko schedule list                   # what's loaded + next-fire times
huko schedule run morning-brief      # fire it once now (via the daemon)
huko schedule disable morning-brief  # frontmatter flip; restart to land

The body of the file is the agent's standing duty — it lands in a cache-stable <scheduled\_task> system-prompt block, while the wall-clock trigger time and the previous run's finalResult go in the trigger user message for cross-fire continuity. Scheduled tasks always run non-interactive (no message(type=ask)); concurrency is skip-if-running per schedule.

Full design + frontmatter reference + a step-by-step tutorial: docs/scheduler.md.

---

Configuration

Layered, just like git config:

built-in defaults  →  \~/.huko/{providers,config,keys}.json  →  <cwd>/.huko/{...}.json

Inspect:

huko info             # everything that's resolved + which layer set what
huko config show      # the runtime config side

Edit through the CLI:

huko config set mode lean --project          # this project only
huko config set mode lean --global           # all projects
huko safety init                             # scaffold per-tool safety rules (project)
huko safety disable web\_fetch                # see Safety section above

Or just edit the JSON files directly — huko reads them on every run, no caching.

---

Local LLMs

huko speaks OpenAI-compatible HTTP, so any local server that does — Ollama, LM Studio, vLLM, llama.cpp's llama-server, LocalAI, text-generation-webui — registers the same way as a hosted provider. Below uses Ollama; the others differ only in --base-url.

# 1. Start the local server + pull a model.
ollama serve \&
ollama pull qwen2.5-coder:7b

# 2. Register a key reference. Most local servers ignore the value but
#    expect SOME string in the Authorization header — pick any placeholder.
huko keys set ollama          # interactive (hidden prompt) — type 'EMPTY' or anything

# 3. Register the provider. Protocol is `openai` (= OpenAI-compatible API).
huko provider add \\
  --name=ollama \\
  --protocol=openai \\
  --base-url=http://127.0.0.1:11434/v1 \\
  --api-key-ref=ollama

# 4. Register the model under that provider; make it current.
huko model add \\
  --provider=ollama \\
  --model-id=qwen2.5-coder:7b \\
  --context-window=32768 \\
  --tool-call-mode=native \\
  --current

# 5. Use it.
huko -- "read main.ts and explain the architecture"

Notes:

  • --context-window= is required for local models — huko sizes compaction thresholds against it. Grab the right number from the model card or ollama show <model> (look for context length).
  • --tool-call-mode=native works when the model + server both implement OpenAI function calling (Qwen 2.5, Llama 3.1, DeepSeek family, recent Ollama). If you see empty / ignored tool calls in responses, switch to --tool-call-mode=xml — huko will encode tool calls inside the prompt and parse them from the model's text reply. Slower and a bit less reliable, but works with any model that can follow instructions.
  • Other servers' default ports: LM Studio http://127.0.0.1:1234/v1, vLLM http://127.0.0.1:8000/v1, llama.cpp llama-server http://127.0.0.1:8080/v1. All keep --protocol=openai.
  • Custom headers (internal gateway, corporate proxy, etc.) — --header=X-Foo=bar on provider add, repeatable.
  • Project-scoped registration: pass --project on provider add / model add to write to <cwd>/.huko/providers.json instead of the global \~/.huko/providers.json. Useful when a single project pins to a specific local model the rest of your machine doesn't use.

Small models (7B-and-below) typically struggle to keep a multi-step agent loop coherent — they hallucinate file paths, forget which tool they just called, or repeat themselves. Lean mode (huko --lean -- "...") is the right pairing: minimal system prompt, just bash as the tool, far less for the model to juggle. For the full agent surface, you'll generally want a 32B+ model or a hosted frontier model.

---

Documentation

Topic File
CLI surface, argv parsing, all flags docs/cli.md
Architecture overview docs/architecture.md
Configuration model docs/config.md
Per-tool safety policy run huko safety init for the template
Docker convention + key resolution docs/docker.md
CI/CD pipeline + release process docs/cicd.md
Working agreements (for contributors) CLAUDE.md

---

Development

git clone https://github.com/alexzhaosheng/huko.git
cd huko
pnpm install              # install deps (engine pulled from npm)
pnpm test                 # full suite, cross-platform (Linux / macOS / Windows × Node 24)
pnpm run check            # strict type check
pnpm run build:cli        # esbuild bundle → dist/cli.js

npm link                  # install your local checkout as `huko`

Or use the unbuilt source directly via tsx:

pnpm run huko -- "your prompt"
# (or directly: npx tsx src/cli/index.ts -- "your prompt")

CI runs the same tsc + test + build matrix on Linux/macOS/Windows × Node 24 for every PR. The Dockerfile gets a sanity build on every PR too. See docs/cicd.md for the full pipeline.

---

License

MIT. See LICENSE.

About

No description, website, or topics provided.

Resources

License

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors