Compress verbose command output into concise summaries using a local LLM. No API keys, no external services — runs entirely on your machine.
cargo build --release 2>&1 | smelt
smelt v0.1.0 compiled, optimized, release profile in 0.92s.
smelt reads stdin, feeds it through a small local model (Qwen3 1.7B, Q4 quantized), and outputs a concise summary. Depending on the input and context, that may span more than one line. The model is automatically downloaded on first run (~1.3GB) to ~/.local/share/smelt/models/.
Inference runs on GPU via Vulkan, CUDA, or Metal. CPU-only inference works but is significantly slower.
When no GPU offload is active, smelt prints a startup warning and falls back to CPU inference.
Pipe any command through smelt:
npm install 2>&1 | smelt
cargo test 2>&1 | smelt
kubectl get pods -A | smelt
git log --oneline -50 | smelt
When input exceeds the model's context window, smelt has two strategies for handling it:
--last
Keeps the last N lines that fit in context, summarizes once. Fast. Best when the end of output has the verdict (build result, test pass/fail, error messages).
--rolling (default)
Processes the input in chunks sequentially, carrying a running summary forward into each next chunk. Slower (one inference per chunk) but captures information from the entire output. When stderr is attached to a terminal, smelt also streams each intermediate compaction there as it goes, without waiting for EOF.
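The two strategies can be sketched in plain shell. This is an illustration under stated assumptions, not smelt's implementation: the budget is counted in characters rather than tokens, chunks are a fixed number of lines, and `summarize_chunk` is a hypothetical stand-in for one model call (here a toy that keeps a running count of lines containing FAIL). The function names are invented for this example.

```shell
# --last, sketched: keep the longest run of trailing lines whose total
# size fits a budget. The budget is in characters here (assumption);
# smelt budgets in tokens.
keep_last_fit() {
  awk -v budget="$1" '
    { lines[NR] = $0 }
    END {
      used = 0; start = NR + 1
      for (i = NR; i >= 1; i--) {
        cost = length(lines[i]) + 1      # +1 for the newline
        if (used + cost > budget) break
        used += cost; start = i
      }
      for (i = start; i <= NR; i++) print lines[i]
    }'
}

# Hypothetical stand-in for one model call: takes the previous summary
# as $1 and a chunk on stdin, prints the new summary. This toy version
# adds the number of FAIL lines in the chunk to the previous count.
summarize_chunk() {
  prev=${1:-0}
  n=$(grep -c 'FAIL' || true)
  echo $((prev + n))
}

# --rolling, sketched: split stdin into fixed-size chunks and carry the
# running summary into each next call. The body runs in a subshell so
# its variables stay local.
rolling_summary() (
  chunk_lines=$1
  summary=""; chunk=""; count=0
  nl='
'
  while IFS= read -r line || [ -n "$line" ]; do
    chunk="${chunk}${line}${nl}"
    count=$((count + 1))
    if [ "$count" -ge "$chunk_lines" ]; then
      summary=$(printf '%s' "$chunk" | summarize_chunk "$summary")
      chunk=""; count=0
    fi
  done
  # Summarize whatever is left over after the last full chunk.
  if [ -n "$chunk" ]; then
    summary=$(printf '%s' "$chunk" | summarize_chunk "$summary")
  fi
  printf '%s\n' "$summary"
)
```

The sketch shows the trade-off described above: `keep_last_fit` makes a single pass and drops everything before the cut, while `rolling_summary` makes one call per chunk and can therefore report a FAIL that appears early in the input.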
# Long test suite — want to know about all failures
pytest -v 2>&1 | smelt
# Long build log — just care about the result
make 2>&1 | smelt --last
Use --head and --tail to include raw lines around the summary:
# First and last 5 raw lines, with the summary in between
some-command 2>&1 | smelt --head 5 --tail 5
# Summary, then the last 10 raw lines
some-command 2>&1 | smelt --tail 10
| Flag | Default | Description |
|---|---|---|
| --last | | Summarize only the tail of the input |
| --rolling | yes | Rolling summary over the full input |
| --head N | | Include the first N raw lines before the summary |
| --tail N | | Include the last N raw lines after the summary |
| --prompt TEXT | "Summarize this command output:" | Custom instruction prompt |
| --ctx-size N | 8192 | Context window size in tokens |
| -v, --verbose | | Show timing and progress on stderr |
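One thing to keep in mind when piping: by default a shell pipeline reports the exit status of its last command, so `some-command 2>&1 | smelt` returns smelt's status rather than the command's. The wrapper below is a minimal sketch of one way around that. It uses only the `--last` and `--tail` flags listed above; `run_summarized` and `SMELT_MIN_LINES` are names invented for this example and are not part of smelt.

```shell
# Hypothetical helper: run a command, summarize its output only when it
# is long, and return the command's own exit status.
run_summarized() (
  threshold=${SMELT_MIN_LINES:-40}   # assumed default, tune to taste
  tmp=$(mktemp) || exit 1

  # Capture stdout and stderr together and remember the exit status.
  status=0
  "$@" >"$tmp" 2>&1 || status=$?

  lines=$(wc -l <"$tmp")
  lines=$((lines + 0))               # normalize any padding from wc

  if [ "$lines" -gt "$threshold" ]; then
    smelt --last --tail 5 <"$tmp"    # long output: summary plus raw tail
  else
    cat "$tmp"                       # short output: pass through as-is
  fi

  rm -f "$tmp"
  exit "$status"
)
```

With this in place, `run_summarized cargo test && echo ok` prints `ok` only when the tests pass. In bash, `set -o pipefail` is a lighter alternative if you always want the summary.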
Benchmarked on RTX 2070 + GTX 1060 with Vulkan:
| Input | Strategy | Time |
|---|---|---|
| 10 lines | last | 1.2s |
| 150 lines | last | 2.7s |
| 150 lines | rolling | 2.9s |
Model load is ~0.8s (cached). Inference runs at ~50 tokens/s on GPU.
smelt uses bartowski/Qwen_Qwen3-1.7B-GGUF (Q4_K_M, ~1.3GB), based on Qwen3-1.7B, and runs it in non-thinking mode for faster, more direct summaries. The model is downloaded automatically to ~/.local/share/smelt/models/ on first run.
Use the containerized builders if you want reproducible backend-specific binaries without relying on host toolchains:
make build-vulkan
make build-cuda
That writes artifacts to:
dist/smelt-vulkan
dist/smelt-cuda
Install them to ~/.local/bin with:
make install-vulkan
make install-cuda
To force a specific container runtime:
CONTAINER_RUNTIME=docker make build-vulkan
The container builds are organized around:
Containerfile.vulkan
Containerfile.cuda
They intentionally use separate images and separate binaries so a CUDA build does not require Vulkan loader libraries, and a Vulkan build does not require CUDA runtime libraries, on the target host.
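If a script needs to set CONTAINER_RUNTIME itself, a small helper can pick whichever runtime is installed. This is a sketch only: `first_available` is a hypothetical name, the podman-then-docker order is an arbitrary choice, and how the Makefile behaves when the variable is unset is not covered here.

```shell
# Hypothetical helper: print the first argument that names a command
# found on PATH; return non-zero if none of them exist.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage (not run here):
#   CONTAINER_RUNTIME=$(first_available podman docker) make build-vulkan
```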
# Vulkan (Linux/Windows with NVIDIA, AMD, or Intel GPUs)
cargo install --path . --features vulkan
# CUDA (NVIDIA only)
cargo install --path . --features cuda
# Metal (macOS)
cargo install --path . --features metal
# CPU only (no GPU acceleration)
cargo install --path .
If you want separate binaries so each host only loads the backend it has runtime libs for, install them under different names:
bash scripts/install-backend.sh vulkan
bash scripts/install-backend.sh cuda
bash scripts/install-backend.sh cpu
That installs to:
~/.local/bin/smelt-vulkan
~/.local/bin/smelt-cuda
~/.local/bin/smelt-cpu
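With several backend binaries installed side by side, a shell function can choose one per host. The sketch below is a heuristic, not something smelt ships: it assumes that finding `nvidia-smi` on PATH suggests a usable CUDA setup and that finding `vulkaninfo` suggests a usable Vulkan setup, neither of which is guaranteed. `have` and `pick_smelt_backend` are names invented for this example.

```shell
# Probe: succeeds when a command exists on PATH. Kept as a separate
# function so the selection logic is easy to test or override.
have() { command -v "$1" >/dev/null 2>&1; }

# Print the name of the smelt binary to run on this host, preferring
# CUDA, then Vulkan, then CPU. A backend is chosen only if both its
# probe tool and its smelt binary are present.
pick_smelt_backend() {
  if have nvidia-smi && have smelt-cuda; then
    echo smelt-cuda
  elif have vulkaninfo && have smelt-vulkan; then
    echo smelt-vulkan
  else
    echo smelt-cpu
  fi
}

# Usage (not run here):
#   cargo test 2>&1 | "$(pick_smelt_backend)"
```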
You can also override the output name:
bash scripts/install-backend.sh vulkan smelt
Arch Linux (Vulkan):
sudo pacman -S vulkan-headers shaderc vulkan-icd-loader
Ubuntu/Debian (Vulkan):
sudo apt install libvulkan-dev glslang-tools
CUDA: Install the CUDA Toolkit.
MIT