Skip to content

TypeWhisper/typewhisper-mac

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

804 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

TypeWhisper for Mac

License: GPL v3 macOS Swift

Speech-to-text and AI text processing for macOS. Transcribe audio using on-device AI models or cloud APIs (Groq, OpenAI, xAI/Grok), then transform the result with reusable workflows. Your voice data stays on your Mac with local models - or use cloud APIs for faster processing.

TypeWhisper 1.4 is the current stable release for macOS. It includes system-wide dictation, file transcription, unified workflows, history, dictionary, snippets, bundled integrations, the community plugin registry, and local automation APIs. Advanced surfaces like the HTTP API, CLI, widgets, watch folders, and the plugin SDK remain supported for power users and automation.

See the release readiness guide, support matrix, and release checklist for the current release definition and ship gates.

typewhisper-demo-full-with-voiceover.mp4

Screenshots

Home Dashboard Recording & Hotkeys Workflows

Installed Integrations Available Integrations File Transcription

History Dictionary Dictionary Term Packs

Snippets General Settings Advanced Settings

Recorder API Menu Bar About

What's New in 1.4

  • Redesigned Integrations - Installed, Discover, and Manual tabs now group built-in, marketplace, community, and manually installed plugins with clearer source, hosting, and capability metadata
  • Community plugin registry - 1.4 builds can discover community plugins through the new registry feed while older 1.3.x builds stay on the official marketplace feed
  • Model download controls - Local model plugins now have clearer downloaded model management, host-version gates, Hugging Face token support, and license acceptance flows where required
  • New bundled providers - xAI/Grok, Smallest AI Pulse, Supertonic TTS, refreshed OpenAI voices, Qwen3 ASR, Granite, Voxtral, and Gemma 4 expand the bundled plugin set
  • Local automation APIs - Recorder and dictionary HTTP endpoints, watch-folder file jobs, per-request STT engine/model overrides, and CLI/API improvements give power users more automation surface
  • Workflow and hotkey polish - Workflow drag reordering, multiple global hotkeys per action, prompt-boundary hardening, direct text hotkeys, and safer Apple Intelligence handling improve day-to-day dictation flows
  • Dictation reliability pass - Live transcript, recording indicators, fullscreen/fixed-display handling, USB and multichannel audio capture, route-change recovery, and hotkey startup latency all received focused fixes

Features

Transcription

  • Eleven engines - WhisperKit (99+ languages, streaming, translation), Parakeet TDT v3 (25 European languages, extremely fast), Apple SpeechAnalyzer (macOS 26+, no model download needed), Granite Speech (MLX-based), Qwen3 ASR (MLX-based), Voxtral (local Voxtral Mini 4B, MLX-based), Groq Whisper, OpenAI Whisper, Smallest Pulse, xAI/Grok STT, and OpenAI Compatible (any OpenAI-compatible API)
  • On-device or cloud - All processing happens locally on your Mac, or use Groq/OpenAI/xAI APIs for faster processing
  • Streaming preview - See partial transcription in real-time while speaking (WhisperKit)
  • Short-clip handling - Better retention of brief utterances and fewer false no-speech discards
  • File transcription - Batch-process multiple audio/video files with drag & drop
  • Subtitle export - Export transcriptions as SRT or WebVTT with timestamps

Dictation

  • System-wide - Push-to-talk, toggle, or hybrid mode via global hotkey, auto-pastes into any app
  • Modifier-key hotkeys - Use a single modifier key (Command, Shift, Option, Control) as your hotkey
  • Indicator styles - Choose Notch, Overlay, or Minimal, with optional live transcript preview where supported
  • Sound feedback - Audio cues for recording start, transcription success, and errors
  • Microphone selection - Choose a specific input device with live preview and improved recovery after route changes

AI Processing

  • Workflows - Build reusable transformations for translation, rewriting, extraction, formatting, and app-specific automation. Workflows can run automatically by app, website, or app + website combinations, from a dedicated hotkey, as a global fallback, or manually from the Workflow Palette. Hotkey workflows can either start dictation or process the current selection/clipboard directly.
  • LLM providers - Apple Intelligence (macOS 26+), Groq, OpenAI / ChatGPT, xAI/Grok, Gemini, and OpenAI Compatible with per-prompt provider and model override
  • Speech providers - System voices, xAI/Grok TTS, and experimental local Supertonic TTS can provide spoken feedback and readback
  • Local prompt processing - Gemma 4 via MLX runs on-device on Apple Silicon, with the current verified release path limited to the E2B/E4B 4-bit models
  • Translation - Translate transcriptions on-device using Apple Translate

Personalization

  • Workflow triggers - Per-app, per-website, combined app + website, hotkey, global fallback, and manual palette-only triggers for language, task, engine, prompt, and auto-submit behavior. Website matching supports subdomains
  • Dictionary - Terms improve cloud recognition accuracy. Corrections fix common transcription mistakes automatically. Auto-learns from manual corrections. Includes importable term packs
  • Localized term packs - Built-in term pack names and descriptions are localized in English and German
  • Snippets - Text shortcuts with trigger/replacement. Supports placeholders like {{DATE}}, {{TIME}}, and {{CLIPBOARD}}
  • History - Searchable transcription history with inline editing, correction detection, app context tracking, timeline grouping, filters, bulk delete, multi-select export, auto-retention, and a standalone window accessible from the tray menu

Integration & Extensibility

  • Plugin system - Extend TypeWhisper with custom LLM providers, transcription engines, TTS providers, post-processors, and action plugins. Gemini, Granite, Groq, Linear, OpenAI / ChatGPT, OpenAI Compatible, Qwen3, Smallest Pulse, Supertonic, Voxtral, Webhook, and xAI/Grok ship as bundled plugins, alongside the local engine plugins. Linear plugin enables voice-to-issue creation. See TypeWhisperPluginSDK/Plugins/README.md
  • Local model download controls - Bundled Qwen3, Granite, Voxtral, and Supertonic plugins support an optional HuggingFace token for higher rate limits and clearer download errors. Supertonic requires explicit OpenRAIL-M model-license acceptance before model assets download.
  • HTTP API - Local REST API for integration with external tools and scripts
  • CLI tool - Shell-friendly transcription via the command line
  • Discord claim service - Optional external service for Polar supporter and GitHub Sponsors Discord role claims

General

  • Home dashboard - Usage statistics, activity chart, and onboarding tutorial
  • Auto-update - Built-in updates via Sparkle with stable, release-candidate, and daily channels
  • Universal binary - Runs natively on Apple Silicon and Intel Macs
  • Widgets - Desktop widgets for usage stats, last transcription, activity chart, and transcription history
  • Multilingual UI - English and German
  • Launch at Login - Start automatically with macOS

Install

Homebrew

brew install --cask typewhisper/tap/typewhisper

Direct Download

Download the latest DMG from GitHub Releases.

Stable direct-download releases use the default Sparkle channel. Release candidates such as 1.4.0-rc* and daily builds are published as GitHub prereleases, update the shared Sparkle appcast on their own channels, and are excluded from Homebrew. Installed builds can switch channels in Settings -> About via the Update Channel picker.

Quick Start

  1. Install TypeWhisper from Homebrew or the latest DMG.
  2. Open Settings and grant Microphone plus Accessibility access.
  3. Pick an engine and, if needed, download a local model.
  4. Trigger the global hotkey and complete your first dictation.

Manual Uninstall (macOS)

These steps are for official TypeWhisper release builds on macOS. They remove the app itself, its local state, widget data, and stored secrets so you can reinstall from a clean slate.

If you installed via Homebrew, you can optionally start with:

brew uninstall --cask typewhisper

That removes the app bundle, but it does not reliably remove all files in ~/Library or TypeWhisper entries in Keychain.

If ~/Library is hidden in Finder, use Go -> Go to Folder... and paste the paths below.

  1. Quit TypeWhisper if it is running.
  2. Delete the app bundle:
    rm -rf /Applications/TypeWhisper.app
  3. Delete app data and plugins:
    rm -rf ~/Library/Application\ Support/TypeWhisper
  4. Delete preferences:
    rm -f ~/Library/Preferences/com.typewhisper.mac.plist
  5. Delete widget and app group data used by official releases:
    rm -rf ~/Library/Group\ Containers/2D8ALY3LCL.com.typewhisper.mac
  6. Remove TypeWhisper secrets from Keychain:
    • In Keychain Access, search for com.typewhisper.mac.apikey and delete matching items.
    • This includes API and plugin secrets stored under the com.typewhisper.mac.apikey.* service prefix.
    • Also remove the license items stored under service com.typewhisper.mac.apikey.license, especially the polar-license and polar-supporter accounts.
  7. If you installed the CLI tool from Settings > Advanced, remove it too:
    rm -f /usr/local/bin/typewhisper
  8. Optional: if you want to remove exported user files as well, delete:
    rm -rf ~/Documents/TypeWhisper\ Recordings
  9. Restart your Mac, then install the latest build again.

If a fresh install still crashes immediately after these steps, please open an issue and include your macOS version, how you installed TypeWhisper, and whether the crash happens on first launch or after granting permissions.

System Requirements

  • macOS 14.0 (Sonoma) or later
  • Apple Silicon (M1 or later) recommended
  • 8 GB RAM minimum, 16 GB+ recommended for larger models
  • Some features (Apple Translate, improved Settings UI) require macOS 15+. Apple Intelligence and SpeechAnalyzer require macOS 26+.

Gemma 4 Support

TypeWhisper includes a bundled local Gemma 4 plugin powered by MLX for on-device prompt processing on Apple Silicon. In the current verified release path, Gemma 4 support is limited to the dense E2B 4-bit and E4B 4-bit variants; larger or unverified variants stay visible in the UI but remain disabled until they are validated end to end.

Model Recommendations

RAM Recommended Models
< 8 GB Whisper Tiny, Whisper Base
8-16 GB Whisper Small, Whisper Large v3 Turbo, Parakeet TDT v3, Voxtral Mini 4B
> 16 GB Whisper Large v3

Build

  1. Clone the repository:

    git clone https://github.com/TypeWhisper/typewhisper-mac.git
    cd typewhisper-mac
  2. Open in Xcode 16+:

    open TypeWhisper.xcodeproj
  3. Select the TypeWhisper scheme and build (Cmd+B). Swift Package dependencies (WhisperKit, FluidAudio, Sparkle, TypeWhisperPluginSDK) resolve automatically.

  4. Run the app. It appears as a menu bar icon - open Settings to download a model.

  5. Run the automated checks before shipping changes:

    xcodebuild test -project TypeWhisper.xcodeproj -scheme TypeWhisper -destination 'platform=macOS,arch=arm64' -parallel-testing-enabled NO CODE_SIGN_IDENTITY='-' CODE_SIGNING_REQUIRED=NO CODE_SIGNING_ALLOWED=NO
    swift test --package-path TypeWhisperPluginSDK

HTTP API

The HTTP API is an advanced local automation surface. It binds to 127.0.0.1 only, is disabled by default, and is intended for local tools and scripts.

Enable the API server in Settings > Advanced (default port: 8978).

Check Status

curl http://localhost:8978/v1/status
{
  "status": "ready",
  "engine": "whisper",
  "model": "openai_whisper-large-v3_turbo",
  "supports_streaming": true,
  "supports_translation": true
}

Transcribe Audio

curl -X POST http://localhost:8978/v1/transcribe \
  -F "file=@recording.wav" \
  -F "language=en"

curl -X POST http://localhost:8978/v1/transcribe \
  -F "file=@recording.wav" \
  -F "language_hint=de" \
  -F "language_hint=en"
{
  "text": "Hello, world!",
  "language": "en",
  "duration": 2.5,
  "processing_time": 0.8,
  "engine": "whisper",
  "model": "openai_whisper-large-v3_turbo"
}

Optional parameters:

  • language - ISO 639-1 code (e.g., en, de). Omit for full auto-detection.
  • language_hint - Repeatable language hint for restricted auto-detection. Do not combine with language.
  • task - transcribe (default) or translate (translates to English, WhisperKit only).
  • target_language - ISO 639-1 code for translation target language (e.g., es, fr). Uses Apple Translate.

Uploads to /v1/transcribe are limited to 256 MiB, including stdin uploads from the CLI. Requests above that size return 413 Payload Too Large. Local CLI file paths use a direct handoff to the running TypeWhisper app instead of uploading the file bytes.

List Models

curl http://localhost:8978/v1/models
{
  "models": [
    {
      "id": "openai_whisper-large-v3_turbo",
      "engine": "whisper",
      "ready": true
    }
  ]
}

History

# Search history
curl "http://localhost:8978/v1/history?q=meeting&limit=10&offset=0"

# Delete entry
curl -X DELETE "http://localhost:8978/v1/history?id=<uuid>"

Dictionary

# List recognition terms
curl http://localhost:8978/v1/dictionary/terms

# Merge terms, or set replace=true to replace all terms
curl -X PUT http://localhost:8978/v1/dictionary/terms \
  -H "Content-Type: application/json" \
  -d '{"terms":["TypeWhisper","WhisperKit"],"replace":false}'

# Delete one term
curl -X DELETE http://localhost:8978/v1/dictionary/terms \
  -H "Content-Type: application/json" \
  -d '{"term":"TypeWhisper"}'

# List post-transcription corrections
curl http://localhost:8978/v1/dictionary/corrections

# Add or update one correction by original text
curl -X PUT http://localhost:8978/v1/dictionary/corrections \
  -H "Content-Type: application/json" \
  -d '{"original":"teh","replacement":"the","caseSensitive":false}'

# Delete one correction
curl -X DELETE http://localhost:8978/v1/dictionary/corrections \
  -H "Content-Type: application/json" \
  -d '{"original":"teh"}'

Workflows

# List all workflow-backed rules
curl http://localhost:8978/v1/rules

# Toggle a workflow-backed rule on/off
curl -X PUT "http://localhost:8978/v1/rules/toggle?id=<uuid>"

Dictation Control

# Start dictation (returns session id)
curl -X POST http://localhost:8978/v1/dictation/start

# Stop dictation (returns same session id)
curl -X POST http://localhost:8978/v1/dictation/stop

# Check whether dictation is currently recording
curl http://localhost:8978/v1/dictation/status

# Fetch status/result for a specific dictation session
curl "http://localhost:8978/v1/dictation/transcription?id=<uuid>"

Dictation control records microphone audio for system-wide insertion. A completed dictation session returns text that TypeWhisper can paste back into the active app.

Recorder Control

Recorder control uses the same recorder path as the TypeWhisper UI, including microphone capture, optional system audio capture, mixing, finalization, and final transcription. Use it for automations that need a saved recording file or meeting/system-audio transcription without auto-pasting into another app.

# Start recorder with microphone and system audio
curl -X POST "http://localhost:8978/v1/recorder/start?mic=true&system_audio=true"

# Stop the active API recorder session
curl -X POST http://localhost:8978/v1/recorder/stop

# Check whether the recorder is currently recording
curl http://localhost:8978/v1/recorder/status

# Fetch status/result for a specific recorder session
curl "http://localhost:8978/v1/recorder/session?id=<uuid>"

POST /v1/recorder/start accepts optional query flags:

  • mic - true, false, 1, or 0. If omitted, TypeWhisper uses the current recorder microphone setting.
  • system_audio - true, false, 1, or 0. If omitted, TypeWhisper uses the current recorder system-audio setting.

At least one source must be enabled. If both resolved sources are disabled, the API returns 400 Bad Request.

Start response:

{
  "id": "8F8C1F45-6D03-44D2-A38C-0C4DE4F7E5F7",
  "status": "recording"
}

Stop response:

{
  "id": "8F8C1F45-6D03-44D2-A38C-0C4DE4F7E5F7",
  "status": "finalizing"
}

Status response:

{
  "recording": true
}

Session response:

{
  "id": "8F8C1F45-6D03-44D2-A38C-0C4DE4F7E5F7",
  "status": "completed",
  "text": "Meeting notes from the recording.",
  "output_file": "/Users/alex/Documents/TypeWhisper Recordings/Recording 2026-05-20 14-30-00.m4a"
}

Recorder sessions move through recording -> finalizing -> completed or failed. If recorder transcription is disabled or produces no transcript, text is omitted and output_file still points to the finalized recording when available. Failed sessions include an error field.

Conflict and lookup behavior:

  • Starting while the recorder is already recording or finalizing returns 409 Conflict.
  • Stopping without an active API recorder session returns 409 Conflict.
  • Polling with a missing or invalid id returns 400 Bad Request.
  • Polling a valid but unknown session id returns 404 Not Found.

CLI Tool

TypeWhisper includes a command-line tool for shell-friendly transcription. It is part of the advanced automation surface and connects to the running local API server.

Installation

Install via Settings > Advanced > CLI Tool > Install. This places the typewhisper binary in /usr/local/bin.

Commands

typewhisper status              # Show server status
typewhisper models              # List available models
typewhisper transcribe file.wav # Transcribe an audio file

Options

Option Description
--port <N> Server port (default: auto-detect)
--json Output as JSON
--language <code> Source language (e.g. en, de)
--language-hint <code> Repeatable language hint for restricted auto-detection
--task <task> transcribe (default) or translate
--translate-to <code> Target language for translation

Examples

# Transcribe with language and JSON output
typewhisper transcribe recording.wav --language de --json

# Restrict auto-detection to a shortlist
typewhisper transcribe recording.wav --language-hint de --language-hint en

# Pipe audio from stdin
cat audio.wav | typewhisper transcribe -

# Use in a script
typewhisper transcribe meeting.m4a --json | jq -r '.text'

The CLI requires the API server to be running (Settings > Advanced) and follows the documented command and flag surface for the current stable release.

Local file paths are handed to the running TypeWhisper app directly, so large files do not need to fit inside an HTTP upload body. Stdin usage (typewhisper transcribe -) still uses the regular /v1/transcribe upload endpoint and is limited to 256 MiB.

Workflows

Workflows let you configure transcription, transformation, and automation behavior per application, website, combined app + website context, hotkey, global fallback, or manual palette-only workflow. For example:

  • Mail - German language, Whisper Large v3
  • Slack - English language, Parakeet TDT v3
  • Terminal - English language, auto-submit enabled
  • github.com - English cleanup workflow that matches in any browser
  • docs.google.com - German dictation workflow that translates to English

Create workflows in Settings > Workflows. Choose a template, then use Automatic to enable app, website, hotkey, or any combination of those trigger components. Always stays the global fallback, and Manual keeps the workflow palette-only. Hotkey workflows choose whether the shortcut starts dictation or processes the current selection/clipboard through the same insertion path as the Workflow Palette. Spoken language can be left on full auto-detect, fixed to one exact language, or restricted to a shortlist of likely languages for better detection accuracy. Website patterns support subdomain matching - e.g. google.com also matches docs.google.com.

When you start dictating, TypeWhisper matches the active app and browser URL against enabled workflows with the following priority:

  1. App + URL match - highest specificity (e.g. Chrome + github.com)
  2. URL-only match - cross-browser workflows (e.g. github.com in any browser)
  3. App-only match - generic app workflows (e.g. all of Chrome)
  4. Always fallback - global workflow when no more specific workflow matches

Hotkeys are direct workflow shortcuts, not context conditions in the app/URL matching order. Manual workflows are excluded from automatic dictation matching. They appear only in the Workflow Palette and use the existing Workflow Palette hotkey.

The active workflow name is shown as a badge in the indicator, together with a short explanation of why it matched.

Multiple engines can be loaded simultaneously for instant switching between workflows. Note that loading multiple local models increases memory usage. Cloud engines (Groq, OpenAI, xAI/Grok) have negligible memory overhead.

Plugins

TypeWhisper supports plugins for adding custom LLM providers, transcription engines, TTS providers, post-processors, and action plugins. Plugins are macOS .bundle files placed in ~/Library/Application Support/TypeWhisper/Plugins/.

Bundled engines and integrations (WhisperKit, Parakeet, SpeechAnalyzer, Granite, Qwen3, Voxtral, Supertonic, Groq, OpenAI, xAI/Grok, OpenAI Compatible, Gemini, Linear, Webhook, and more) are implemented as plugins and serve as reference implementations.

See TypeWhisperPluginSDK/Plugins/README.md for the full plugin development guide, including the event bus, host services API, and manifest format.

Architecture

TypeWhisper/
├── typewhisper-cli/           # Command-line tool (status, models, transcribe)
├── PluginRegistry/            # Source registry entries for community plugin feeds
├── Plugins/                # Redirect docs and legacy entrypoint for moved first-party plugin sources
├── TypeWhisperPluginSDK/   # Plugin SDK (Swift package)
│   ├── Plugins/            # First-party plugin sources and manifests
├── TypeWhisperWidgetExtension/ # WidgetKit widgets (stats, activity, history)
├── TypeWhisperWidgetShared/    # Shared widget data models
├── App/                    # App entry point, dependency injection
├── Models/                 # Data models (TranscriptionResult, Profile, PromptAction, etc.)
├── Services/
│   ├── Cloud/              # KeychainService, WavEncoder (shared cloud utilities)
│   ├── LLM/               # Apple Intelligence provider (cloud LLM providers are plugins)
│   ├── HTTPServer/         # Local REST API (HTTPServer, APIRouter, APIHandlers)
│   ├── ModelManagerService # Transcription dispatch (delegates to plugins)
│   ├── AudioRecordingService
│   ├── AudioFileService    # Audio/video - 16kHz PCM conversion
│   ├── HotkeyService
│   ├── TextInsertionService
│   ├── WorkflowService     # Workflow matching and persistence
│   ├── HistoryService      # Transcription history persistence (SwiftData)
│   ├── DictionaryService   # Custom term corrections
│   ├── SnippetService      # Text snippets with placeholders
│   ├── PromptActionService # Prompt action persistence (SwiftData)
│   ├── PromptProcessingService # LLM orchestration for prompt execution
│   ├── PluginManager       # Plugin discovery, loading, and lifecycle
│   ├── PluginRegistryService # Plugin marketplace (download, install, update)
│   ├── PostProcessingPipeline # Priority-based text processing chain
│   ├── EventBus            # Typed publish/subscribe event system
│   ├── TranslationService  # On-device translation via Apple Translate
│   ├── SubtitleExporter    # SRT/VTT export
│   └── SoundService        # Audio feedback for recording events
├── ViewModels/             # MVVM view models with Combine
├── Views/                  # SwiftUI views
└── Resources/              # Info.plist, entitlements, localization, sounds

Patterns: MVVM with ServiceContainer singleton for dependency injection. ViewModels use a static _shared pattern. Localization via String(localized:) with Localizable.xcstrings.

License

GPLv3 - see LICENSE for details. Commercial licensing available - see LICENSE-COMMERCIAL.md.

About

Local speech-to-text for macOS on-device AI, fully private, optional cloud

Topics

Resources

License

GPL-3.0, Unknown licenses found

Licenses found

GPL-3.0
LICENSE
Unknown
LICENSE-COMMERCIAL.md

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Sponsor this project

  •  

Packages

 
 
 

Contributors

Languages