Press a hotkey in any macOS app. Speak naturally. Watch your words appear in real time. All transcription happens locally on Apple Silicon.
Works in any macOS app - browsers, text editors, Terminal, Slack, Messages. Text is injected wherever your cursor is focused.
Live mode shows text as you speak. Only the changed parts update - no flickering, no jumping.
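One way to picture "only the changed parts update" is a common-prefix diff between the text already on screen and the newest transcript. The sketch below is an illustration under that assumption, not Fisper's actual implementation; `minimal_update` and `apply_update` are hypothetical names.

```python
def minimal_update(shown: str, latest: str) -> tuple[int, str]:
    """Return (backspaces, text_to_type) that turns `shown` into `latest`."""
    # Count the characters both strings share at the start;
    # that part stays on screen untouched.
    common = 0
    for a, b in zip(shown, latest):
        if a != b:
            break
        common += 1
    # Delete only the stale tail, then type only the new tail.
    return len(shown) - common, latest[common:]


def apply_update(shown: str, backspaces: int, typed: str) -> str:
    """Simulate what the focused text field contains after the update."""
    return shown[: len(shown) - backspaces] + typed
```

If the recognizer revises "recognize speech" to "recognize speed limits", this yields two backspaces followed by "d limits", so the first fourteen characters are never redrawn.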
Runs AI speech models directly on Apple Silicon. No cloud requests. No API keys. No internet required after the initial model download.
Three models to match your needs: base (~150 MB) for speed, small (~500 MB) for balance, or large-v3-turbo (~3 GB) for maximum accuracy.
F5, ⌥Space, double-tap Right ⌘, and more. Seven options so you never clash with existing bindings.
No dock icon. No main window. Sits in your menu bar, invisible until you need it. A floating HUD shows recording state.
Download the DMG. Drag Fisper to Applications. Launch it, and it appears in your menu bar.
Grant Microphone and Accessibility access in System Settings. Fisper detects the change instantly - no restart needed.
Press F5 (or your preferred shortcut) in any application. Speak naturally. Your words appear at the cursor in real time.
Everything is configurable from the menu bar. Pick a Whisper model based on your speed and accuracy needs, switch between live streaming and batch transcription, and choose a global shortcut that fits your workflow.
Enable "Auto Submit in Terminal" in settings and Fisper detects when your cursor is in a terminal. When you stop recording, it automatically transcribes and submits the message - perfect for working with Claude Code and other terminal-based AI agents.
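The auto-submit decision can be sketched as a small pure function: append a newline only when the setting is on and the frontmost app is a terminal. This is a rough illustration, not Fisper's code. The function name, the `frontmost_bundle_id` parameter, and the short list of bundle identifiers are assumptions; a real app would get the identifier from macOS and would likely recognize more terminals.

```python
# Assumed, deliberately short list of terminal apps (illustrative only).
TERMINAL_BUNDLE_IDS = {"com.apple.Terminal", "com.googlecode.iterm2"}


def text_to_inject(transcript: str, frontmost_bundle_id: str, auto_submit: bool) -> str:
    """Return the text to type, with a trailing newline when it should be submitted."""
    text = transcript.strip()
    # Submit only non-empty transcripts, only in terminals, only when enabled.
    if auto_submit and text and frontmost_bundle_id in TERMINAL_BUNDLE_IDS:
        return text + "\n"
    return text
```

Keeping the decision separate from the keystroke injection makes it easy to check that text dictated into a browser or editor is never submitted by accident.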
Every syllable is processed on-device by the Apple Neural Engine. Audio samples live only in memory during transcription. There are no accounts, no analytics, no telemetry, no cloud.
Yes. No trials, no subscriptions, no in-app purchases. Free forever.
After the initial model download, everything runs on-device. Turn off Wi-Fi and it works the same.
Any Apple Silicon Mac (M1 or later) running macOS 14 Sonoma or newer. Intel Macs are not supported because Fisper runs its Whisper models on the Neural Engine.
English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Turkish, Ukrainian, Polish, Dutch, and Russian. You can also set it to auto-detect.
Depends on the model. Large-v3-turbo rivals cloud services in accuracy. Base is faster but less precise. Pick the trade-off that works for you.
It types text by simulating keystrokes into whatever app is focused. macOS requires Accessibility access for that. Without it, Fisper can record but cannot inject text.
Not if you pick a different shortcut. The default F5 does not clash with Apple's dictation trigger. Six other hotkey options are available.
Base is ~150 MB, Small is ~500 MB, Large-v3-turbo is ~3 GB. Models are downloaded once and cached locally.
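"Downloaded once and cached locally" boils down to checking for the file before fetching it. The sketch below shows that pattern under stated assumptions: the `.bin` file naming, the `ensure_model` helper, and the injected `download` callback are hypothetical, and the sizes are the approximate figures quoted above.

```python
from pathlib import Path
from typing import Callable

# Approximate sizes in MB, as quoted above.
MODEL_SIZES_MB = {"base": 150, "small": 500, "large-v3-turbo": 3000}


def ensure_model(name: str, cache_dir: Path, download: Callable[[str, Path], None]) -> Path:
    """Return the cached model path, calling `download` only if the file is missing."""
    if name not in MODEL_SIZES_MB:
        raise ValueError(f"unknown model: {name}")
    target = cache_dir / f"{name}.bin"  # hypothetical file layout
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        download(name, target)  # the network is touched only on this first call
    return target
```

After the first call succeeds, later calls return the same path without invoking `download`, which is why the app can keep working with Wi-Fi turned off.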
Stop typing what you could just say. Stop sending your voice to the cloud.
Download. Permit. Press F5.
Three steps. No accounts. No signup. No setup beyond that.