STT

Hold a key, speak, release -- your words appear wherever your cursor is.

Like SuperWhisper, but free. Like Wispr Flow, but local.

Cross-platform -- macOS (Apple Silicon) and Linux (Wayland: Sway, Hyprland, etc.)
Any backend -- local MLX Whisper, any OpenAI-compatible server, whisper.cpp HTTP, or Groq cloud
Hold-to-record -- global hotkey works in any application
Free & open source -- no subscription, no cloud dependency required

Install

uv tool install git+https://github.com/jamesob/stt.git

A setup wizard runs on first launch. To update:

uv tool install --reinstall git+https://github.com/jamesob/stt.git

Linux dependencies

STT checks for missing dependencies at startup and prints install commands. For reference:

Arch Linux:

sudo pacman -S wtype wl-clipboard gtk4-layer-shell \
    gobject-introspection portaudio pipewire-pulse

Debian / Ubuntu:

sudo apt install wtype wl-clipboard gtk4-layer-shell-dev \
    libgirepository1.0-dev gir1.2-gtk-4.0 gir1.2-gtk4layershell-1.0 \
    libportaudio2 portaudio19-dev pipewire-pulse

Your user must be in the input group for keyboard capture:

sudo usermod -aG input $USER
newgrp input  # or log out and back in

macOS permissions

Grant Accessibility and Input Monitoring (System Settings > Privacy & Security) to your terminal app -- not to "stt".

Usage

stt

Action	Keys
Record	Hold trigger key (default: Right Cmd / Left Alt)
Record + Enter	Hold Shift while recording
Cancel	ESC
Quit	Ctrl+C

Configuration

Settings live in ~/.config/stt/config.yml. Run stt --config to reconfigure. See config.sample.yml for all options.

language: en
hotkey: cmd_r
sound_enabled: true

backends:
  default:
    provider: openai
    openai_base_url: http://localhost:8000
    openai_whisper_model: whisper-large-v3

order:
  - default
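
Each name under order has to match a key under backends. A minimal sketch of that relationship, assuming the YAML has already been loaded into a plain dict (the helper name undefined_backends is hypothetical and is not part of STT):

```python
from typing import Any, Dict, List


def undefined_backends(config: Dict[str, Any]) -> List[str]:
    """Return names listed under `order` that have no entry under `backends`."""
    backends = config.get("backends") or {}
    order = config.get("order") or []
    return [name for name in order if name not in backends]


# The sample configuration above, written as the dict a YAML loader would produce.
config = {
    "language": "en",
    "hotkey": "cmd_r",
    "sound_enabled": True,
    "backends": {
        "default": {
            "provider": "openai",
            "openai_base_url": "http://localhost:8000",
            "openai_whisper_model": "whisper-large-v3",
        }
    },
    "order": ["default"],
}

missing = undefined_backends(config)
print("undefined backends:", missing)
```

An empty list means every entry in order refers to a defined backend. Whether STT itself performs a check like this is not stated here.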

Providers

Provider	Backend keys	Notes
`openai`	`openai_base_url`, `openai_api_key`, `openai_whisper_model`	Any OpenAI-compatible server (vLLM, faster-whisper, etc.)
`whisper-cpp-http`	`whisper_cpp_http_url`	Local whisper.cpp HTTP server
`mlx`	`whisper_model`	Apple Silicon, offline
`parakeet`	`parakeet_model`	Apple Silicon, English only, very fast
`groq`	`groq_api_key`	Cloud, requires API key

Fallback chains

Backends listed under the order key are tried in sequence, from first to last. If a backend that sets connect_timeout is unreachable, STT falls back to the next one:

backends:
  qwen:
    provider: openai
    openai_base_url: http://gpu-server:8200
    openai_whisper_model: Qwen/Qwen3-ASR-1.7B
    connect_timeout: 2
  local:
    provider: mlx
    whisper_model: large-v3-turbo

order:
  - qwen
  - local
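
The fallback behaviour can be sketched roughly as follows. This is an illustration under stated assumptions, not STT's actual implementation: the exception BackendUnreachable, the function transcribe_with_fallback, and the two stand-in backends are all hypothetical names.

```python
from typing import Callable, Dict, List


class BackendUnreachable(Exception):
    """Hypothetical error a backend raises when it cannot be reached in time."""


def transcribe_with_fallback(
    audio: bytes,
    backends: Dict[str, Callable[[bytes], str]],
    order: List[str],
) -> str:
    """Try each backend named in `order`; return the first successful result."""
    errors = []
    for name in order:
        try:
            return backends[name](audio)
        except BackendUnreachable as exc:
            # Remember why this backend was skipped, then try the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def unreachable_gpu(audio: bytes) -> str:
    # Stand-in for a remote server that does not answer within connect_timeout.
    raise BackendUnreachable("connect timed out")


def local_model(audio: bytes) -> str:
    # Stand-in for a local backend that always answers.
    return "hello world"


text = transcribe_with_fallback(
    b"fake-audio",
    {"qwen": unreachable_gpu, "local": local_model},
    ["qwen", "local"],
)
print(text)
```

In this sketch only an unreachable backend triggers the fallback; any other error propagates to the caller.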

Benchmark mode runs all backends in parallel and logs timing for comparison. The first backend listed under order is the primary, and its result is the one used:

benchmark: true
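
A rough sketch of what such a benchmark run might look like, using a thread pool from the Python standard library. The function run_benchmark and the stand-in backends are hypothetical; how STT actually schedules and logs benchmark runs is not described here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def run_benchmark(
    audio: bytes,
    backends: Dict[str, Callable[[bytes], str]],
    order: List[str],
) -> Tuple[str, Dict[str, float]]:
    """Run every backend in `order` concurrently.

    Returns the primary (first) backend's text and per-backend timings in seconds.
    """

    def timed(name: str) -> Tuple[str, str, float]:
        start = time.perf_counter()
        text = backends[name](audio)
        return name, text, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=len(order)) as pool:
        results = list(pool.map(timed, order))

    texts = {name: text for name, text, _ in results}
    timings = {name: elapsed for name, _, elapsed in results}
    return texts[order[0]], timings


def fast_backend(audio: bytes) -> str:
    return "fast result"


def slow_backend(audio: bytes) -> str:
    time.sleep(0.05)  # simulate a slower backend
    return "slow result"


primary_text, timings = run_benchmark(
    b"fake-audio",
    {"qwen": slow_backend, "local": fast_backend},
    ["qwen", "local"],
)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
print("used:", primary_text)
```

The primary backend's text is returned even when another backend finishes sooner, matching the rule that only the first entry's result is used.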

Prompt tuning

The prompt setting helps Whisper recognize domain-specific terms:

prompt: Claude, Anthropic, TypeScript, React, API endpoint
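
For an OpenAI-compatible backend, a prompt like this would typically travel as the prompt form field of the transcription request, alongside model and language. The sketch below shows one plausible mapping from the config to those fields; the helper build_transcription_fields is hypothetical, and the exact way STT builds its requests is an assumption.

```python
from typing import Any, Dict


def build_transcription_fields(
    config: Dict[str, Any], backend: Dict[str, Any]
) -> Dict[str, str]:
    """Build the text form fields for an OpenAI-style transcription request.

    The audio file itself would be sent as a separate multipart part.
    """
    fields = {"model": backend["openai_whisper_model"]}
    # Optional settings are only sent when they are present and non-empty.
    if config.get("language"):
        fields["language"] = config["language"]
    if config.get("prompt"):
        fields["prompt"] = config["prompt"]
    return fields


config = {
    "language": "en",
    "prompt": "Claude, Anthropic, TypeScript, React, API endpoint",
}
backend = {
    "provider": "openai",
    "openai_base_url": "http://localhost:8000",
    "openai_whisper_model": "whisper-large-v3",
}

print(build_transcription_fields(config, backend))
```

Leaving prompt out of the config simply omits the field, so the backend falls back to its default behaviour.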

Development

git clone https://github.com/jamesob/stt.git
cd stt
uv sync
uv run stt

License

MIT
