26 voice models, cloud and local, built for Hyprland dictation.
Press a toggle key, speak, and get instant text input. Built natively for Wayland/Hyprland with clean PipeWire capture and robust text injection.
- 26 speech-to-text models across cloud and local providers, including whisper.cpp.
- Optional LLM post-processing for grammar, punctuation, and more.
- Toggle workflow with optional status notifications and cancel support.
- Text injection via ydotool, wtype, and clipboard fallback with clipboard restore.
- Guided onboarding and a full configure menu with hot-reload.
- Personalization through a custom prompt and keywords, sent to both the LLM and the voice model.
- Wispr Flow quality, but open source and for Linux.
- Support for streaming models for low-latency transcription.
All supported speech-to-text providers and models:

- `whisper-1` (batch), `gpt-4o-transcribe` (batch), `gpt-4o-mini-transcribe` (batch), `gpt-4o-realtime-preview` (streaming)
- `whisper-large-v3`, `whisper-large-v3-turbo`
- `voxtral-mini-latest`
- `scribe_v1` (batch), `scribe_v2` (batch), `scribe_v2_realtime` (streaming)
- English-only: `tiny.en`, `base.en`, `small.en`, `medium.en`
- Multilingual: `tiny`, `base`, `small`, `medium`, `large-v1`, `large-v2`, `large-v3`, `large-v3-turbo`
- `flux-general-en`, `nova-3`, `nova-2`
```bash
yay -S hyprvoice-bin
# or
paru -S hyprvoice-bin
```

The package installs system dependencies and the systemd user service. You'll still need an API key for a cloud provider, or whisper.cpp for local transcription. Onboarding will guide you through the choice.
- Run onboarding:

  ```bash
  hyprvoice onboarding
  ```

- Enable and start the service:

  ```bash
  systemctl --user enable --now hyprvoice.service
  ```

- Add a keybinding (Hyprland example):

  ```conf
  bind = SUPER, R, exec, hyprvoice toggle
  ```

  See Hyprland Keybindings for push-to-talk and other patterns.

- Test voice input:

  ```bash
  hyprvoice toggle
  ```

Run `hyprvoice configure` anytime for advanced settings.
```conf
# ~/.config/hypr/hyprland.conf
bind = SUPER, R, exec, hyprvoice toggle
```

Each press toggles between recording and idle.
Combine both bind types to get hold-to-record behavior — press to start, release to stop:
```conf
# ~/.config/hypr/hyprland.conf
bind = SUPER, R, exec, hyprvoice toggle   # key down → start recording
bindr = SUPER, R, exec, hyprvoice toggle  # key up → stop and transcribe
```

This gives a walkie-talkie feel: hold the key while speaking, release when done. The daemon receives two toggle commands: the first starts recording, the second stops it and triggers transcription.
| Keyword | Fires on |
|---|---|
| `bind` | Key press (down) |
| `bindr` | Key release (up) |
With bindr, modifier keys (SUPER, CTRL, etc.) are fully released before the command executes. This can prevent modifiers from interfering with text injection.
```bash
hyprvoice onboarding
hyprvoice configure
hyprvoice serve
hyprvoice toggle
hyprvoice cancel
hyprvoice status
hyprvoice version
hyprvoice stop
```

```bash
hyprvoice model list
hyprvoice model list --provider whisper-cpp
hyprvoice model download base.en
hyprvoice model remove base.en
```

```bash
hyprvoice test-models
hyprvoice test-models --audio /path/to/sample.wav --output test-models.json
```

```bash
systemctl --user status hyprvoice.service
systemctl --user restart hyprvoice.service
journalctl --user -u hyprvoice.service -f
```

Configuration lives in `~/.config/hyprvoice/config.toml` and hot-reloads automatically.
- First-time setup: `hyprvoice onboarding`
- Full TUI editor: `hyprvoice configure`

- `docs/config.md`: configuration reference and examples
- `docs/providers.md`: provider and model details
- `docs/architecture.md`: architecture and adapter overview
- `docs/structure.md`: code map and entry points
- `docs/testing.md`: integration testing with test-models
Daemon won't start:

```bash
# Check if already running
hyprvoice status
# Check for stale files
ls -la ~/.cache/hyprvoice/
# Clean up and restart
rm -f ~/.cache/hyprvoice/hyprvoice.pid
rm -f ~/.cache/hyprvoice/control.sock
hyprvoice serve
```

Command not found:
```bash
# Check installation
which hyprvoice
# Add to PATH if using ~/.local/bin
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
```

No audio recording:
```bash
# Check PipeWire is running
systemctl --user status pipewire
# Test microphone
pw-record --help
pw-record test.wav
# Check microphone permissions and levels
```

Audio device issues:
```bash
# List available audio devices
pw-cli list-objects | grep -A5 -B5 Audio
# Check microphone is not muted in system settings
```

No desktop notifications:
```bash
# Test notify-send directly
notify-send "Test" "This is a test notification"
# Install if missing
sudo pacman -S libnotify        # Arch
sudo apt install libnotify-bin  # Ubuntu/Debian
```

Text not appearing:
- Ensure the cursor is in a text field when toggling off recording.
- Check that the `wtype` and `wl-clipboard` tools are installed:

  ```bash
  # Test wtype directly
  wtype "test text"
  # Test clipboard tools
  echo "test" | wl-copy
  wl-paste
  ```

- Verify that your Wayland compositor supports text input protocols.
- Check the injection backends in your configuration (a fallback chain is most robust).
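The tool check above can be done in one pass. This is a minimal sketch, not part of Hyprvoice: `missing_tools` is a hypothetical helper name, and the list of tools is taken from the injection backends mentioned in this README (`ydotool`, `wtype`, and the `wl-copy`/`wl-paste` pair from `wl-clipboard`).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a hyprvoice command): print the name of every
# given command that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Check the tools used by the injection backends described in this README.
missing="$(missing_tools ydotool wtype wl-copy wl-paste)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names are joined onto one line.
  echo "Missing injection tools:" $missing
else
  echo "All injection tools found"
fi
```

Only the backends you have enabled in your configuration need to be present, so a missing tool is a problem only if it belongs to a backend you rely on.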
Clipboard issues:

```bash
# Install wl-clipboard if missing
sudo pacman -S wl-clipboard    # Arch
sudo apt install wl-clipboard  # Ubuntu/Debian
# Test clipboard functionality
wl-copy "test text"
wl-paste
```

```bash
# Run daemon with verbose output
hyprvoice serve
# Check logs from systemd service (or just see results from hyprvoice serve)
journalctl --user -u hyprvoice.service -f
# Test individual commands
hyprvoice toggle
hyprvoice status
```

Hyprvoice uses a daemon + pipeline architecture for efficient resource management:
- Control Daemon: Lightweight IPC server managing lifecycle
- Pipeline: Stateful audio processing (recording → transcribing → processing → injecting)
- State Machine: `idle → recording → transcribing → processing → injecting → idle`
```mermaid
flowchart LR
  subgraph Client
    CLI["CLI/Tool"]
  end
  subgraph Daemon
    D["Control Daemon (lifecycle + IPC)"]
  end
  subgraph Pipeline
    A["Audio Capture"]
    T["Transcribing"]
    I["Injecting (wtype + clipboard)"]
  end
  N["notify-send/log"]
  CLI -- unix socket --> D
  D -- start/stop --> A
  A -- frames --> T
  T -- status --> D
  D -- events --> N
  D -- inject action --> T
  T --> I
  I -->|done| D
```
```mermaid
stateDiagram-v2
  [*] --> idle
  idle --> recording: toggle
  recording --> transcribing: first_frame
  transcribing --> processing: llm_enabled
  transcribing --> injecting: llm_disabled
  processing --> injecting: inject_action
  injecting --> idle: done
  recording --> idle: abort
  injecting --> idle: abort
```
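The transitions in the state diagram can be read as a lookup from (state, event) to next state. The sketch below only illustrates that table in shell; it is not Hyprvoice's implementation, and `next_state` is a hypothetical name. State and event names are copied from the diagram.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the state that follows STATE when EVENT occurs,
# using only the transitions listed in the state diagram.
# Returns non-zero for any pair the diagram does not define.
next_state() {
  local state="$1" event="$2"
  case "$state:$event" in
    idle:toggle)                     echo recording ;;
    recording:first_frame)           echo transcribing ;;
    transcribing:llm_enabled)        echo processing ;;
    transcribing:llm_disabled)       echo injecting ;;
    processing:inject_action)        echo injecting ;;
    injecting:done)                  echo idle ;;
    recording:abort|injecting:abort) echo idle ;;
    *) echo "invalid transition: $state on $event" >&2; return 1 ;;
  esac
}

# Walk one session with LLM post-processing enabled.
state=idle
for event in toggle first_frame llm_enabled inject_action done; do
  state="$(next_state "$state" "$event")" || break
  echo "$event -> $state"
done
```

Rejecting undefined pairs makes it easy to see which events are ignored in which state, for example `done` while idle.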
- Toggle recording → Pipeline starts, audio capture begins
- Audio streaming → PipeWire frames buffered for transcription
- Toggle stop → Recording ends, transcription starts
- LLM processing → Text cleaned up (if enabled)
- Text injection → Result typed or copied to clipboard
- Return to idle → Pipeline cleaned up, ready for next session
- `toggle` (daemon) → create pipeline → recording
- First frame arrives → transcribing (daemon may notify `Transcribing` later)
- Audio frames → audio buffer (collect all audio during session)
- Second `toggle` during transcribing → transcribe collected audio
- If LLM enabled → processing → clean up text with LLM
- injecting → type or paste text
- Complete → idle; pipeline stops; daemon clears reference
- Notifications at key transitions
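The injecting step types or pastes the text through a chain of backends. A fallback chain of that kind can be sketched in shell as below. This is an illustration under assumptions, not Hyprvoice's code: `inject_with_fallback` and the `via_*` wrappers are hypothetical names, the backend order is only an example, and clipboard restore is left out.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: call each backend function in order with the text,
# stop at the first one that succeeds, and print its name.
# Returns non-zero if every backend fails.
inject_with_fallback() {
  local text="$1" backend
  shift
  for backend in "$@"; do
    if "$backend" "$text"; then
      echo "$backend"
      return 0
    fi
  done
  return 1
}

# Example backends wrapping the real tools. Each one fails cleanly
# when its tool is not installed, so the chain moves on to the next.
via_ydotool()   { command -v ydotool >/dev/null 2>&1 && ydotool type "$1" >/dev/null; }
via_wtype()     { command -v wtype   >/dev/null 2>&1 && wtype "$1"; }
via_clipboard() { command -v wl-copy >/dev/null 2>&1 && wl-copy "$1"; }

# Usage (commented out so the sketch has no side effects when sourced):
# inject_with_fallback "hello world" via_ydotool via_wtype via_clipboard
```

Passing the backends as arguments keeps the order configurable, which is the point of a fallback chain: the clipboard path can always go last as the catch-all.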
MIT License - see LICENSE.md for details.