Hold a key, speak, release -- your words appear wherever your cursor is.
Like SuperWhisper, but free. Like Wispr Flow, but local.
- Cross-platform -- macOS (Apple Silicon) and Linux (Wayland: Sway, Hyprland, etc.)
- Any backend -- local MLX Whisper, any OpenAI-compatible server, whisper.cpp HTTP, or Groq cloud
- Hold-to-record -- global hotkey works in any application
- Free & open source -- no subscription, no cloud dependency required
Install with uv:

    uv tool install git+https://github.com/jamesob/stt.git

A setup wizard runs on first launch. To update:

    uv tool install --reinstall git+https://github.com/jamesob/stt.git

STT checks for missing dependencies at startup and prints install commands. For reference:

Arch Linux:

    sudo pacman -S wtype wl-clipboard gtk4-layer-shell \
        gobject-introspection portaudio pipewire-pulse

Debian / Ubuntu:

    sudo apt install wtype wl-clipboard gtk4-layer-shell-dev \
        libgirepository1.0-dev gir1.2-gtk-4.0 gir1.2-gtk4layershell-1.0 \
        libportaudio2 portaudio19-dev pipewire-pulse

On Linux, your user must be in the input group for keyboard capture:

    sudo usermod -aG input $USER
    newgrp input  # or log out and back in

On macOS, grant Accessibility and Input Monitoring (System Settings > Privacy & Security) to your terminal app -- not to "stt".

Start it with:

    stt

| Action | Keys |
|---|---|
| Record | Hold trigger key (default: Right Cmd / Left Alt) |
| Record + Enter | Hold Shift while recording |
| Cancel | ESC |
| Quit | Ctrl+C |
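
The key bindings above can be read as a small state machine. The sketch below is a toy model for illustration only, not STT's actual code: the `Recorder` class, the key names, and the log strings are all hypothetical, and it assumes that pressing Shift at any point during a recording is enough to request the trailing Enter.

```python
from dataclasses import dataclass, field


@dataclass
class Recorder:
    """Toy model of the key table (hypothetical, not STT's implementation)."""

    recording: bool = False
    press_enter: bool = False
    log: list = field(default_factory=list)

    def on_key(self, key: str, pressed: bool) -> None:
        if key == "trigger" and pressed and not self.recording:
            # Holding the trigger key starts a recording.
            self.recording = True
            self.press_enter = False
            self.log.append("start")
        elif key == "shift" and pressed and self.recording:
            # Assumption: Shift at any point while recording requests Enter.
            self.press_enter = True
        elif key == "esc" and pressed and self.recording:
            # ESC discards the recording; nothing is typed.
            self.recording = False
            self.log.append("cancel")
        elif key == "trigger" and not pressed and self.recording:
            # Releasing the trigger ends the recording and types the text.
            self.recording = False
            self.log.append("type+enter" if self.press_enter else "type")
```

Feeding it a trigger press, a Shift press, and a trigger release records `start` followed by `type+enter`; pressing ESC before the release records `cancel`, and the later release is ignored.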
Settings live in `~/.config/stt/config.yml`. Run `stt --config` to reconfigure. See `config.sample.yml` for all options.

    language: en
    hotkey: cmd_r
    sound_enabled: true
    backends:
      default:
        provider: openai
        openai_base_url: http://localhost:8000
        openai_whisper_model: whisper-large-v3
    order:
      - default

| Provider | Backend keys | Notes |
|---|---|---|
| `openai` | `openai_base_url`, `openai_api_key`, `openai_whisper_model` | Any OpenAI-compatible server (vLLM, faster-whisper, etc.) |
| `whisper-cpp-http` | `whisper_cpp_http_url` | Local whisper.cpp HTTP server |
| `mlx` | `whisper_model` | Apple Silicon, offline |
| `parakeet` | `parakeet_model` | Apple Silicon, English only, very fast |
| `groq` | `groq_api_key` | Cloud, requires API key |
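
As a rough illustration of how the provider table maps onto a config file, the sketch below checks an already-parsed config (a plain dict) against it. It is not part of STT: `PROVIDER_KEYS` and `check_config` are hypothetical names, the key sets are copied from the table, and it assumes `order` is a top-level list of backend names. It only flags keys that belong to a different provider, since the table does not say which keys are required.

```python
# Keys per provider, copied from the table above.
PROVIDER_KEYS = {
    "openai": {"openai_base_url", "openai_api_key", "openai_whisper_model"},
    "whisper-cpp-http": {"whisper_cpp_http_url"},
    "mlx": {"whisper_model"},
    "parakeet": {"parakeet_model"},
    "groq": {"groq_api_key"},
}


def check_config(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed config (hypothetical helper)."""
    problems = []
    backends = config.get("backends", {})
    for name, backend in backends.items():
        provider = backend.get("provider")
        if provider not in PROVIDER_KEYS:
            problems.append(f"{name}: unknown provider {provider!r}")
            continue
        for key in backend:
            # Which providers does the table list this key under?
            owners = [p for p, keys in PROVIDER_KEYS.items() if key in keys]
            if owners and provider not in owners:
                problems.append(
                    f"{name}: key {key!r} belongs to provider {owners[0]!r}, "
                    f"not {provider!r}"
                )
    # Assumption: `order` is a top-level list of backend names.
    for name in config.get("order", []):
        if name not in backends:
            problems.append(f"order: {name!r} is not defined under backends")
    return problems
```

Keys the table does not list, such as `connect_timeout`, are deliberately left alone, so the check stays conservative.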

Backends listed in `order` are tried in sequence. If a backend that sets `connect_timeout` is unreachable, STT falls back to the next one in the list:

    backends:
      qwen:
        provider: openai
        openai_base_url: http://gpu-server:8200
        openai_whisper_model: Qwen/Qwen3-ASR-1.7B
        connect_timeout: 2
      local:
        provider: mlx
        whisper_model: large-v3-turbo
    order:
      - qwen
      - local

Benchmark mode runs all backends in parallel and logs timing for comparison. The first backend in `order` is the primary (its result is used):

    benchmark: true

The `prompt` setting helps Whisper recognize domain-specific terms:

    prompt: Claude, Anthropic, TypeScript, React, API endpoint

To run from source:

    git clone https://github.com/jamesob/stt.git
    cd stt
    uv sync
    uv run stt

License: MIT