ecotokens



Token-saving companion for Claude Code, Gemini CLI, Qwen Code, and Pi. Built on a "set it and forget it!" philosophy: one install command, zero configuration, and ecotokens works automatically from there — intercepting tool outputs before they reach the model, filtering the noise, and recording how many tokens you saved.

ecotokens demo

Feature highlights

Feature Details
PreToolUse hook Intercepts every shell (Bash) command before its output reaches the model — filters, compresses, and records savings
PostToolUse hook (Claude Code, Gemini CLI, Qwen Code) Intercepts native tool results (Read/read_file, Grep/search_file_content, Glob/list_directory) — outline-based compression for source files, grep trimming, glob denoising
Gain dashboard Interactive TUI — token savings by command family or project, sparkline, diff view, history log
Multi-agent support Works with Claude Code, Gemini CLI, Qwen Code, and Pi out of the box
Precision guarantees Errors, failures, and stack traces are never removed; secrets are redacted before filtering
Code intelligence BM25 + vector search (Candle, zero-config), symbol lookup, call graph tracing, near-duplicate detection
MCP server (Claude Code, Gemini CLI, Qwen Code) Exposes code-intelligence tools over stdio (ecotokens mcp-server) and auto-registers in agent settings on install
AI summarization (optional) Large outputs compressed by a local Ollama model instead of being truncated
Word abbreviations (optional) Replace common words with shorter forms (function → fn, configuration → config, …) in narrative text, and nudge the model to do the same via a SessionStart instruction
Zero config One ecotokens install command — works automatically from there

How it works

ecotokens installs hooks that intercept tool outputs before they reach the model. Two interception points are supported:

PreToolUse / BeforeTool — fires before every shell (Bash) command:

  1. Runs the command and captures its output
  2. Applies a family-specific filter (git, cargo, python, …)
  3. Optionally summarizes large outputs via a local AI model (Ollama)
  4. Returns the compressed output to the model
  5. Records the before/after token counts in a local metrics store

PostToolUse / AfterTool (Claude Code, Gemini CLI, Qwen Code) — fires after native file-tool calls:

  1. Intercepts the tool result before it enters the context window
  2. Applies a specialized filter (outline for source files, grep result trimming, glob path denoising)
  3. Returns the compressed result to the model
  4. Records the savings under the native_read, grep, or fs family

Claude Code uses the PreToolUse + PostToolUse hooks (~/.claude/settings.json). Gemini CLI uses the BeforeTool + AfterTool hooks (~/.gemini/settings.json). Qwen Code uses the PreToolUse + PostToolUse hooks (~/.qwen/settings.json). Pi uses a TypeScript extension (~/.pi/agent/extensions/ecotokens.ts) that intercepts tool_call (bash pre-exec) and tool_result (read/grep/find/ls post-exec) events in-process.

For a focused view of the runtime path, see docs/hook-filter-metrics-flow.md.

The result: the model sees clean, concise output — and you keep your context window.
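The PreToolUse steps above can be sketched as a small pipeline. This is an illustrative Python sketch, not the crate's actual API: the function names, the trivial blank-line filter, and the chars × 0.25 heuristic stand in for the real family-specific filters and token counters.

```python
import subprocess

def estimate_tokens(text: str) -> int:
    # Default fast character heuristic (chars * 0.25), per the README.
    return max(1, round(len(text) * 0.25))

def filter_output(family: str, output: str) -> str:
    # Stand-in for the family-specific filters (git, cargo, python, ...).
    # Here: drop blank lines as a trivial example of noise removal.
    return "\n".join(line for line in output.splitlines() if line.strip())

def run_filtered(cmd: list[str]) -> dict:
    raw = subprocess.run(cmd, capture_output=True, text=True).stdout
    filtered = filter_output(cmd[0], raw)
    # Conservative fallback: never return something larger than the original.
    if len(filtered) >= len(raw):
        filtered = raw
    return {
        "output": filtered,
        "tokens_before": estimate_tokens(raw),   # recorded in the metrics store
        "tokens_after": estimate_tokens(filtered),
    }
```

The real hook performs the same shape of work in-process: run, filter with fallback, then record before/after counts.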

Quick install

cargo install --git https://github.com/hansipie/ecotokens

For exact token counting (tiktoken cl100k_base instead of the character heuristic):

cargo install --git https://github.com/hansipie/ecotokens --features exact-tokens

Build from source

git clone https://github.com/hansipie/ecotokens.git
cd ecotokens
cargo build --release
./target/release/ecotokens --help

To install the locally built binary into Cargo's bin directory:

cargo install --path .

With exact token counting enabled via tiktoken (cl100k_base encoding):

cargo install --path . --features exact-tokens

By default, token counts use a fast character heuristic (chars × 0.25, ~80-85% accuracy). The exact-tokens feature has no effect on filtering behavior — only the token counts recorded in metrics become more precise.

Installation

Claude Code

cargo install --path .
ecotokens install

In addition to hook installation, this also registers an MCP server entry in ~/.claude/settings.json:

{
  "mcpServers": {
    "ecotokens": {
      "command": "ecotokens",
      "args": ["mcp-server"]
    }
  }
}

Gemini CLI

Requires Gemini CLI ≥ 0.1.0.

cargo install --path .
ecotokens install --target gemini

This writes BeforeTool and AfterTool hook entries into ~/.gemini/settings.json. The AfterTool hook intercepts read_file, search_file_content, and list_directory results.

It also registers the ecotokens MCP server in ~/.gemini/settings.json.

Qwen Code

Requires Qwen Code.

cargo install --path .
ecotokens install --target qwen

This writes PreToolUse and PostToolUse hook entries into ~/.qwen/settings.json. The PostToolUse hook intercepts read_file, search_files, and list_dir results.

It also registers the ecotokens MCP server in ~/.qwen/settings.json.

Pi

Requires Pi (@mariozechner/pi-coding-agent ≥ 0.62.0).

cargo install --path .
ecotokens install --target pi

This writes a TypeScript extension to ~/.pi/agent/extensions/ecotokens.ts. Pi auto-discovers it on next startup (or /reload inside an active session). The extension intercepts bash commands before execution and filters native tool results (read, grep, find, ls) after execution.

All targets at once

ecotokens install --target all

--target all covers Claude Code, Gemini CLI, Qwen Code, and Pi in a single command.

With AI summarization

Enable AI-powered output compression via Ollama at install time:

ecotokens install --ai-summary                          # use default model (llama3.2:3b)
ecotokens install --ai-summary-model qwen2.5:3b         # specify model (implies --ai-summary)

This writes ai_summary_enabled and ai_summary_model to ~/.config/ecotokens/config.json. Ollama must be running and the model must be pulled (ollama pull llama3.2:3b).

Uninstall

ecotokens uninstall                    # Claude Code
ecotokens uninstall --target gemini    # Gemini CLI
ecotokens uninstall --target qwen      # Qwen Code
ecotokens uninstall --target pi        # Pi
ecotokens uninstall --target all       # all targets

Commands

Command Description
ecotokens install Install the PreToolUse + PostToolUse hooks and register the MCP server entry in ~/.claude/settings.json
ecotokens uninstall Remove all hooks (PreToolUse, PostToolUse, SessionStart, SessionEnd) and the MCP server entry
ecotokens filter -- CMD [ARGS] Run a command, filter its output, record metrics
ecotokens filter --cwd DIR -- CMD [ARGS] Same, with an explicit working directory
ecotokens hook-post PostToolUse handler — intercept native tool results (Read, Grep, Glob)
ecotokens gain Interactive TUI dashboard — savings by family or project
ecotokens gain --period PERIOD Filter TUI to a time window (all, today, week, month)
ecotokens gain --history Print a savings summary table for 24h / 7 days / 30 days
ecotokens gain --json JSON report
ecotokens config [--debug true|false] Show or update global configuration (including debug mode)
ecotokens config --model MODEL Set the default model used for cost calculations (empty or unknown value lists available models)
ecotokens index [--path DIR] Index a codebase for BM25 + symbolic search
ecotokens search QUERY [--context N] [--include GLOB] [--exclude GLOB] [--no-trace] Search the indexed codebase with line numbers, context, and optional trace augmentation
ecotokens outline PATH List symbols in a file or directory
ecotokens symbol ID Look up a symbol by its stable ID
ecotokens trace callers SYMBOL Find callers of a symbol
ecotokens trace callees SYMBOL Find callees of a symbol
ecotokens watch [--path DIR] Watch a directory and keep the index up to date
ecotokens mcp-server [--index-dir DIR] Start the stdio MCP server exposing search/outline/symbol/trace/duplicates tools
ecotokens auto-watch enable Start watch automatically on each Claude Code session
ecotokens auto-watch disable Disable automatic watch
ecotokens abbreviations enable Replace common words with abbreviations in filtered outputs + inject a matching instruction at SessionStart
ecotokens abbreviations disable Turn abbreviations off (default)
ecotokens abbreviations list List the active dictionary (defaults merged with user overrides)
ecotokens duplicates Detect near-duplicate code blocks in the indexed codebase
ecotokens clear --all Delete all recorded interceptions
ecotokens clear --before DATE Delete interceptions recorded before DATE (YYYY-MM-DD)
ecotokens clear --older-than DURATION Delete interceptions older than a duration (e.g. 30d, 2w, 1m)
ecotokens clear --family FAMILY Delete interceptions of a specific command family
ecotokens clear --project PATH Delete interceptions for a specific project (use "[undefined]" for entries without a git root)
ecotokens completions SHELL Generate a shell completion script (bash, zsh, fish, powershell, elvish)
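The duration shorthand accepted by clear --older-than (30d, 2w, 1m) maps a number plus a unit suffix to a span of days. A minimal sketch of such a parser — treating a month as 30 days is an assumption here, not a documented guarantee:

```python
from datetime import timedelta

# Suffix -> days multiplier; "m" = 30 days is an assumption.
UNITS = {"d": 1, "w": 7, "m": 30}

def parse_duration(spec: str) -> timedelta:
    value, unit = int(spec[:-1]), spec[-1]
    return timedelta(days=value * UNITS[unit])

print(parse_duration("2w").days)  # 14
```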

Shell completions

# zsh
ecotokens completions zsh > ~/.zsh/completions/_ecotokens

# bash
ecotokens completions bash > ~/.local/share/bash-completion/completions/ecotokens

# fish
ecotokens completions fish > ~/.config/fish/completions/ecotokens.fish

# PowerShell
ecotokens completions powershell >> $PROFILE

Reload your shell (or open a new terminal) to activate completions.

Gain dashboard

ecotokens gain                                          # all time, uses default model from config
ecotokens gain --period today                           # today only
ecotokens gain --period week                            # last 7 days
ecotokens gain --period month --model claude-sonnet-4-6 # last 30 days, override model
ecotokens gain --history                                # summary table: 24h / 7d / 30d
ecotokens gain --history --json                         # same, as JSON

The model used for cost calculations defaults to the value set with ecotokens config --model (or claude-sonnet-4-6 if not configured). Pass --model to override for a single invocation.
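Cost reporting reduces to tokens saved times the per-token price of the selected model. A sketch with a made-up pricing entry — the numbers below are not the crate's built-in table:

```python
# Hypothetical pricing table: USD per 1M input tokens (illustrative only).
PRICING = {"example-model": {"input_usd_per_1m": 3.00}}

def usd_saved(tokens_saved: int, model: str = "example-model") -> float:
    price = PRICING[model]["input_usd_per_1m"]
    return tokens_saved * price / 1_000_000

# e.g. 6,714,085 tokens saved at a hypothetical $3 per 1M input tokens
print(round(usd_saved(6_714_085), 2))
```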

Interactive TUI showing token savings per command family and per project, with a sparkline. The --period flag filters both the stats and the history panels.

Keybindings:

Key Action
j / u Navigate up / down in list
k / i Scroll history log down / up (family log view)
l / o Scroll detail / diff / SplitRaw BEFORE panel down / up
L / O Scroll SplitRaw AFTER panel down / up
p Switch to project view (from family view)
f Switch to family view (from project view)
d Cycle detail mode (details → diff → split raw) — family view only
s Cycle sparkline scale (linear / log / capped)
q / Esc Quit

Filter command

ecotokens filter runs a command directly and returns its filtered output. Useful for testing filters or wrapping commands in scripts:

ecotokens filter -- cargo test
ecotokens filter --debug -- git log --oneline -50
ecotokens filter --cwd /path/to/project -- cargo test

The output is compressed by the same family-specific filters used by the hook, and token savings are recorded in the metrics store.

Watch command

ecotokens watch monitors a directory and automatically re-indexes files as they change.

ecotokens watch                    # foreground, TUI progress
ecotokens watch --path ./src       # watch a specific directory
ecotokens watch --background       # fork to background
ecotokens watch --status           # show status of background process
ecotokens watch --status --json    # JSON status output
ecotokens watch --stop             # stop the background process

Note: Background logs are only written if global debug is enabled (ecotokens config --debug true).

Auto-watch (Claude Code & Qwen Code)

ecotokens auto-watch integrates with Claude Code and Qwen Code's session lifecycle to start and stop the watcher automatically.

ecotokens auto-watch enable    # enable auto-watch, install SessionStart/SessionEnd hooks
ecotokens auto-watch disable   # disable (hooks remain installed but are no-ops)

When enabled, ecotokens watch --background starts automatically when a session opens, and stops when it closes. The setting is stored in ~/.config/ecotokens/config.json (auto_watch: true/false).

Note: Auto-watch relies on SessionStart / SessionEnd hooks. For Qwen Code, session hooks are installed automatically if ecotokens is already installed for Qwen (ecotokens install --target qwen). Gemini CLI does not expose session lifecycle hooks.

Word abbreviations

ecotokens abbreviations enable    # transform narrative text + inject model instruction
ecotokens abbreviations list      # show the active dictionary
ecotokens abbreviations disable   # back to default

When enabled, a post-processing pass replaces full words with shorter forms in the narrative parts of tool outputs (code blocks between triple backticks are preserved). A matching additionalContext payload is emitted at SessionStart so the model adopts the same abbreviations in its own responses.
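A minimal sketch of such a pass, under the assumption stated above that only segments outside triple-backtick fences are rewritten (the dictionary here is abbreviated):

```python
import re

ABBREVIATIONS = {"function": "fn", "configuration": "config"}

def abbreviate(text: str) -> str:
    # Split on fenced code blocks; odd-indexed parts are inside fences.
    parts = re.split(r"(```.*?```)", text, flags=re.DOTALL)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 0:  # narrative text: apply whole-word replacements
            for word, abbr in ABBREVIATIONS.items():
                part = re.sub(rf"\b{word}\b", abbr, part)
        out.append(part)  # code blocks pass through untouched
    return "".join(out)

print(abbreviate("The function reads configuration.\n```\nfunction f() {}\n```"))
```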

See the full list of default abbreviations in docs/abbreviations.md.

Keep the feature flag in ~/.config/ecotokens/config.json:

{
  "abbreviations_enabled": true
}

... and put custom pairs in a separate ~/.config/ecotokens/abbreviations.json file:

{
  "function": "func",
  "repository": "repo"
}

Bonus Tools

MCP server (Claude Code, Gemini CLI, Qwen Code)

ecotokens mcp-server starts a stdio MCP server backed by the ecotokens index and trace engines.

ecotokens mcp-server
ecotokens mcp-server --index-dir ~/.config/ecotokens/index

Exposed tools:

  • ecotokens_search — BM25 + semantic search
  • ecotokens_outline — symbol outline for file/directory
  • ecotokens_symbol — fetch full symbol source by stable ID
  • ecotokens_trace_callers — find callers of a symbol
  • ecotokens_trace_callees — find callees (with depth)
  • ecotokens_duplicates — detect near-duplicate code blocks

For Claude Code, Gemini CLI, and Qwen Code, ecotokens install registers this server automatically in each target's settings file.

Search command

ecotokens search QUERY performs BM25 (+ optional semantic) search over the indexed codebase and returns results anchored to the matching line.

ecotokens search "embed_text"                        # top 5 results, 2 lines of context
ecotokens search "embed_text" --context 4            # 4 lines above and below the match
ecotokens search "error" --include "*.rs"            # Rust files only
ecotokens search "TODO" --exclude "*.md" --exclude "*.toml"
ecotokens search "find_callers" --no-trace           # pure BM25, no trace augmentation
ecotokens search "find_callers" --json               # JSON output with callers array
ecotokens search "query" --top-k 10                  # more results

Output format:

src/search/query.rs:29 (score: 11.068)
  27:  
  28:  pub fn search_index(opts: SearchOptions) -> tantivy::Result<Vec<SearchResult>> {
  29:      let index = Index::open_in_dir(&opts.index_dir)?;
  30:      let (_, file_path_field, content_field, kind_field, line_start_field, _) = build_schema();
  31:  

When the query matches a symbol name, callers are automatically appended:

# Symbol match — call sites via trace
  src/main.rs:1301 [caller]  cmd_search

Results are automatically scoped to the current git project when using the global index — files from other indexed projects are silently filtered out.

Duplicates command

Less code means fewer tokens

ecotokens duplicates scans the indexed codebase for near-identical code blocks and reports them grouped by similarity.

ecotokens duplicates                          # default: threshold=70%, min_lines=5
ecotokens duplicates --threshold 80           # only report ≥ 80% similarity
ecotokens duplicates --min-lines 10           # ignore blocks shorter than 10 lines
ecotokens duplicates --json                   # JSON output

Each group shows the file paths, line ranges, similarity score, and a refactoring proposal (exact duplicate, near duplicate, or subset).

Configuration

ecotokens config           # show all settings (text)
ecotokens config --json    # show all settings (JSON)

Output includes:

hook_installed        : true
debug                 : false
debuglog              : false
default_model         : claude-sonnet-4-6
exclusions            : []
embed_provider        : candle (sentence-transformers/all-MiniLM-L6-v2)
ai_summary_enabled    : false
ai_summary_model      : llama3.2:3b (default)
ai_summary_url        : http://localhost:11434 (default)
abbreviations_enabled : false

Debug mode

Enable the global debug mode to see detailed interception logs and enable background logging for the watch command:

ecotokens config --debug true
ecotokens config --debug false

This updates the debug field in ~/.config/ecotokens/config.json.

Debug file logging

Enable structured per-hook logging to a file for deeper tracing of what ecotokens intercepts:

ecotokens config --debuglog true
ecotokens config --debuglog false

When enabled, every hook invocation appends a JSONL entry to ~/.config/ecotokens/debug.log:

{"ts":"2026-05-08T12:00:00Z","uid":"a1b2c3d4","cmd":"git status","phase":"input","data":{...}}
{"ts":"2026-05-08T12:00:00Z","uid":"a1b2c3d4","cmd":"git status","phase":"output","data":{...}}

Each entry contains a short uid to correlate the input and output phases of the same invocation. Distinct from --debug (which prints to stderr) — --debuglog writes silently to disk and survives across sessions.

Default model for cost calculations

The model selected here determines the per-token price used in gain reports:

ecotokens config --model claude-opus-4-7    # set default model
ecotokens config --model ""                 # list available models
ecotokens config --model unknown-model      # unknown model → lists available models

The model name must be present in the built-in pricing table (or overridden via model_pricing in ~/.config/ecotokens/config.json). Passing an empty value or an unrecognised name prints the full list and exits.

See the full list of built-in models and prices in docs/models.md.

Override any entry or add a new model via model_pricing in ~/.config/ecotokens/config.json:

{
  "model_pricing": {
    "my-custom-model": { "input_usd_per_1m": 0.50, "output_usd_per_1m": 2.00 }
  }
}


Supported command families

Family Examples
git git status, git diff, git log
cargo cargo build, cargo test, cargo clippy
python pytest, ruff, uv run, poetry, pipx
javascript jest, mocha, yarn
cpp gcc, clang, make, cmake
fs ls, find, tree
markdown .md files
config .toml, .json, .yaml
generic Everything else (truncated to 200 lines / 50 KB)
native_read Claude Code Read tool results (PostToolUse, outline-based compression)

Note: Family detection uses the basename of the first token, so commands invoked via absolute path (/usr/bin/git), venv (.venv/bin/pytest), version managers (~/.cargo/bin/cargo), or wrappers (poetry run) are correctly matched to their family.
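The basename matching rule can be sketched in a few lines. The family table here is abbreviated, and the real detector handles more cases (such as unwrapping poetry run):

```python
import os

# Abbreviated first-token -> family table (illustrative subset).
FAMILIES = {"git": "git", "cargo": "cargo", "pytest": "python", "ls": "fs"}

def detect_family(command: str) -> str:
    first = command.split()[0]
    name = os.path.basename(first)  # strips /usr/bin/, .venv/bin/, ~/.cargo/bin/, ...
    return FAMILIES.get(name, "generic")

print(detect_family("/usr/bin/git status"))  # basename "git" -> family "git"
print(detect_family(".venv/bin/pytest -q"))  # basename "pytest" -> family "python"
```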

Embeddings

ecotokens search uses dual BM25 + vector retrieval with score fusion (0.4 × BM25 + 0.6 × cosine). The vector index is powered by Candle — a zero-config local embedding engine. No external service required.
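The fusion itself is a weighted sum of the two scores. A sketch — normalizing BM25 by the top score in the result set is an assumption; the crate may normalize differently:

```python
def fuse(bm25: float, cosine: float, max_bm25: float) -> float:
    # Normalize BM25 to [0, 1] against the best score; cosine is already bounded.
    bm25_norm = bm25 / max_bm25 if max_bm25 > 0 else 0.0
    return 0.4 * bm25_norm + 0.6 * cosine

print(round(fuse(bm25=11.0, cosine=0.8, max_bm25=11.0), 2))  # 0.4*1.0 + 0.6*0.8
```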

Provider

Candle (default) — runs sentence-transformers/all-MiniLM-L6-v2 (384 dim) locally. The model is downloaded automatically from HuggingFace Hub on first use (~90 MB, cached in ~/.cache/huggingface/).

# Candle is active by default — nothing to configure
ecotokens index --path /your/project
ecotokens search "your query"

Each result includes a retrieval_source field (bm25, vector, or both) visible in JSON output.

Disable embeddings

ecotokens config --embed-provider none    # fall back to pure BM25

Workflow

# 1. Index your project (Candle embeddings computed automatically)
ecotokens index --path /your/project

# 2. Search with hybrid scoring
ecotokens search "your query"

# 3. JSON output with retrieval_source
ecotokens search "your query" --json

Model change detection

When the configured embedding model changes, ecotokens index automatically rebuilds the vector index (hnsw_index.bin) without touching the BM25 index. Embeddings for unchanged files are reused between runs.

AI summarization (optional)

When enabled, large command outputs (> ~2500 tokens) are summarized by a local Ollama model instead of being truncated. Falls back to generic filtering if Ollama is unavailable or times out.
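The fallback logic can be sketched generically by passing the summarizer in as a callable; in the real implementation that callable would be an HTTP request to a local Ollama model with a short timeout, and the 2500-token threshold comes from the README:

```python
def compress(output: str, summarize=None, token_threshold: int = 2500) -> str:
    def generic_filter(text: str, max_lines: int = 200) -> str:
        # Stand-in for the generic truncation filter.
        lines = text.splitlines()
        return text if len(lines) <= max_lines else "\n".join(lines[:max_lines])

    estimated_tokens = len(output) // 4  # character heuristic
    if summarize and estimated_tokens > token_threshold:
        try:
            return summarize(output)      # e.g. a local Ollama model
        except Exception:                 # unavailable or timed out
            pass                          # fall through to generic filtering
    return generic_filter(output)

def broken_summarizer(text: str) -> str:
    raise TimeoutError("ollama did not answer in time")

big = "x" * 20_000
print(compress(big, summarize=broken_summarizer) == compress(big))  # True: fallback used
```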

Enable via install:

ecotokens install --ai-summary-model llama3.2:3b

Or update the config file directly (~/.config/ecotokens/config.json):

{
  "ai_summary_enabled": true,
  "ai_summary_model": "llama3.2:3b"
}

Ollama must be running locally. The summarizer is called with a 3-second timeout so a slow or unresponsive Ollama never stalls the hook.

Benchmarks

Measured over 13 days on a real developer workstation (4,129 hook executions):

Metric Value
Tokens saved 6,714,085
Overall reduction 89.6%
Git commands 96.6% reduction
Cargo commands 75.4% reduction
Best single run git diff --staged — 1.68M → 782 tokens (99.97%)

Full benchmark report

Precision Guarantees

Filtering is aggressive on noise, conservative on signal:

  • Short outputs are never modified — outputs under 200 lines or 50 KB pass through unchanged
  • Errors are always preserved — error[, FAILED, E (pytest), --- FAIL: (Go), stack traces and panic messages are never removed
  • Failure sections are fully kept — structured blocks (=== FAILURES ===, failures:, failure diffs) are always passed through in their entirety
  • Conservative fallback — if a family filter doesn't improve the output (filtered ≥ original), the original is returned as-is
  • Secrets are redacted before filtering — 33 patterns covering cloud keys, AI APIs, VCS tokens, payment secrets and more are detected and replaced before any content reaches the model. See docs/secret-patterns.md for the full list.
  • UTF-8 safe truncation — truncation always happens at character boundaries, never mid-codepoint
  • Head + tail preservation — when generic truncation applies, the first and last 20 lines are always kept (start context + end result)
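The head + tail rule above can be sketched directly; the keep count of 20 comes from the guarantee, while the wording of the splice marker is an assumption:

```python
def truncate_head_tail(text: str, keep: int = 20) -> str:
    lines = text.splitlines()
    if len(lines) <= 2 * keep:
        return text  # short enough: pass through unchanged
    omitted = len(lines) - 2 * keep
    # Keep start context and end result, splice a marker in between.
    return "\n".join(lines[:keep] + [f"... ({omitted} lines omitted) ..."] + lines[-keep:])

out = truncate_head_tail("\n".join(f"line {i}" for i in range(100)))
print(out.splitlines()[0], "|", out.splitlines()[-1])  # line 0 | line 99
```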

Requirements

  • Rust ≥ 1.75 (stable)
  • One or more of: Claude Code (with hook support), Gemini CLI ≥ 0.1.0, Qwen Code, Pi ≥ 0.62.0
  • Internet access on first use (Candle downloads all-MiniLM-L6-v2 ~90 MB from HuggingFace Hub; cached locally after that)
  • Ollama (optional, for AI summarization only)

Contributing

Contributions are welcome! Please read the contributing guidelines before submitting a pull request.

License

MIT
