Support & Donations

If you find this project helpful, consider supporting it!

Whisper - Speech-to-Text for Emacs

A simple Emacs package that provides speech-to-text functionality using Whisper.cpp. Record audio directly from Emacs and have it transcribed and inserted at your cursor position.

Demo Video

Perfect Speech to Text in Emacs - See the package in action!

Features

- Record audio with simple key bindings
- Two transcription modes:
  - Fast mode (C-c v): Uses base.en model for quick transcription
  - Accurate mode (C-c n): Uses medium.en model for more accurate results
- Vocabulary hints: Provide a custom vocabulary file to improve recognition of proper nouns and specialized terms (e.g., Greek names like Socrates, Alcibiades, Diotima)
- Automatic transcription using Whisper.cpp
- Text insertion at cursor position
- Non-blocking recording with user-controlled stop

Prerequisites

Before setting up this package, you need to install the following system dependencies:

1. Sox (for audio recording)

Ubuntu/Debian:

sudo apt install sox

macOS:

brew install sox

Arch Linux:

sudo pacman -S sox

2. Whisper.cpp

Clone and build Whisper.cpp:

# Clone the repository
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp

# Build the project
make

# Download models
# For fast mode (required)
bash ./models/download-ggml-model.sh base.en

# For accurate mode (required)
bash ./models/download-ggml-model.sh medium.en

Make sure the paths in the code match your installation:

Whisper binary: ~/whisper.cpp/build/bin/whisper-cli
Fast mode model: ~/whisper.cpp/models/ggml-base.en.bin
Accurate mode model: ~/whisper.cpp/models/ggml-medium.en.bin

If you install Whisper.cpp in a different location, you'll need to update the paths in whisper.el.
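
A quick way to confirm those three paths before loading the package is a small shell check. This is only a sketch and is not part of the package: `check_whisper_install` is a hypothetical helper, and its argument defaults to the `~/whisper.cpp` location assumed above.

```shell
# Hypothetical helper: verify the files this README expects under a whisper.cpp checkout.
# Usage: check_whisper_install [ROOT]   (ROOT defaults to ~/whisper.cpp)
check_whisper_install() {
  root="${1:-$HOME/whisper.cpp}"
  missing=0
  # The binary must exist and be executable.
  if [ ! -x "$root/build/bin/whisper-cli" ]; then
    echo "missing: $root/build/bin/whisper-cli"
    missing=1
  fi
  # base.en is used by fast mode, medium.en by accurate mode.
  for model in ggml-base.en.bin ggml-medium.en.bin; do
    if [ ! -f "$root/models/$model" ]; then
      echo "missing: $root/models/$model"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "ok: $root"
  fi
  return "$missing"
}
```

Running `check_whisper_install` with no argument checks the default location; pass another directory if Whisper.cpp lives elsewhere.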

Installation

Option 1: Manual Installation

Clone or download this repository:

git clone <your-repo-url> ~/.emacs.d/whisper

Add the following to your init.el or .emacs file:

;; Add the package directory to load-path
(add-to-list 'load-path "~/.emacs.d/whisper")

;; Load the package
(require 'whisper)

Option 2: Direct File Installation

Copy whisper.el to your Emacs configuration directory:
```
cp whisper.el ~/.emacs.d/
```

Add the following to your init.el:

;; Load the whisper package
(load-file "~/.emacs.d/whisper.el")

Option 3: Using use-package

If you use use-package, add this to your init.el:

(use-package whisper
  :load-path "~/.emacs.d/whisper"
  :bind (("C-c v" . whisper-transcribe-fast)
         ("C-c n" . whisper-transcribe)))

Usage

Key Bindings

C-c v: Fast mode (base.en model) - quicker transcription, suitable for most use cases
C-c n: Accurate mode (medium.en model) - slower but more accurate transcription

Basic Workflow

Start recording: Press C-c v (fast) or C-c n (accurate) to begin recording audio
Stop recording: Press C-g to stop recording and start transcription
Get results: The transcribed text will be automatically inserted at your cursor position

Example

Open any text buffer in Emacs
Position your cursor where you want the transcribed text
Press C-c v for fast transcription or C-c n for accurate transcription
Speak into your microphone
Press C-g when finished speaking
Wait a moment for transcription to complete
The text appears at your cursor position

Configuration

Custom Key Bindings

To change the key bindings, modify your init.el:

;; Use different key bindings
(global-set-key (kbd "C-c s") #'whisper-transcribe-fast)  ; Fast mode
(global-set-key (kbd "C-c S") #'whisper-transcribe)       ; Accurate mode

Custom Model Path

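To use a different model for accurate mode, set the whisper-model-path variable in your init.el. The large Whisper models are multilingual only, so there is no large.en variant; download one first with, for example, bash ./models/download-ggml-model.sh large-v3.

;; Use a different model (e.g., the large-v3 model for even better accuracy)
(setq whisper-model-path "~/whisper.cpp/models/ggml-large-v3.bin")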

Custom Paths

If your Whisper.cpp installation is in a different location, you'll need to modify the paths in whisper.el:

;; Example: if whisper-cli is in /usr/local/bin/
;; Edit the format strings in whisper.el from:
;; "~/whisper.cpp/build/bin/whisper-cli -m ~/whisper.cpp/models/ggml-base.en.bin ..."
;; to:
;; "/usr/local/bin/whisper-cli -m /path/to/your/model.bin ..."

Custom Vocabulary for Proper Nouns

To improve transcription accuracy for proper nouns, technical terms, or specialized vocabulary, create a vocabulary file at ~/.emacs.d/whisper-vocabulary.txt.

Example ~/.emacs.d/whisper-vocabulary.txt:

This transcription discusses classical Greek philosophy, including scholars and figures such as Thrasymachus, Socrates, Plato, Diotima, Alcibiades, and Phaedrus.

Custom vocabulary location:

(setq whisper-vocabulary-file "~/Documents/my-vocabulary.txt")

For detailed guidance on vocabulary formats, tips, domain-specific examples, and managing multiple vocabularies, see VOCABULARY-GUIDE.md.
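
To see whether a vocabulary file actually helps, one option is to pass its contents to whisper-cli by hand through the `--prompt` (initial prompt) option and compare the output with and without it. This sketch assumes, without confirming it against whisper.el, that the package uses the file in roughly this way; `vocab_prompt` is a hypothetical helper that collapses the file into a single line.

```shell
# Hypothetical helper: flatten a vocabulary file into a single-line prompt.
# Newlines and tabs become spaces, runs of spaces are squeezed,
# and leading/trailing spaces are trimmed.
vocab_prompt() {
  tr '\n\t' '  ' < "$1" | tr -s ' ' | sed 's/^ //; s/ $//'
}

# Example invocation (paths are the defaults assumed in this README):
#   ~/whisper.cpp/build/bin/whisper-cli \
#     -m ~/whisper.cpp/models/ggml-base.en.bin \
#     -f test.wav --prompt "$(vocab_prompt ~/.emacs.d/whisper-vocabulary.txt)"
```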

Troubleshooting

Common Issues

"sox: command not found"
- Install sox using your system package manager
"whisper-cli: command not found"
- Ensure Whisper.cpp is built and the path is correct
- Check that ~/whisper.cpp/build/bin/whisper-cli exists
No audio recorded
- Check your microphone permissions
- Test sox manually: sox -d -r 16000 -c 1 -b 16 test.wav (press Ctrl-C to stop recording)
Transcription stuck on "Processing transcription, please wait..."
- This issue has been fixed in recent versions; update to the latest whisper.el if you still see it
- The fix improves process sentinel handling, cleans up old processes automatically, and manages temporary files properly
- Transcription now completes reliably when switching between fast and accurate modes
Transcription not working
- Verify the model files exist:
  - Fast mode: ~/whisper.cpp/models/ggml-base.en.bin
  - Accurate mode: ~/whisper.cpp/models/ggml-medium.en.bin
- Test whisper-cli manually with a wav file

Testing the Setup

Test each component individually:

# Test sox recording (record 5 seconds)
sox -d -r 16000 -c 1 -b 16 test.wav trim 0 5

# Test whisper transcription (fast mode)
~/whisper.cpp/build/bin/whisper-cli -m ~/whisper.cpp/models/ggml-base.en.bin -f test.wav

# Test whisper transcription (accurate mode)
~/whisper.cpp/build/bin/whisper-cli -m ~/whisper.cpp/models/ggml-medium.en.bin -f test.wav
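
If the recording step appears to succeed but transcription comes back empty, it can help to confirm that test.wav holds audio data and not just a header. The following is a rough sketch, assuming the canonical 44-byte PCM WAV header that sox normally writes for this format; `wav_has_audio` is a hypothetical helper, not something the package provides.

```shell
# Hypothetical helper: succeed only if FILE exists and is larger than
# a bare 44-byte WAV header (an assumption about the header size).
wav_has_audio() {
  [ -f "$1" ] || return 1
  # $(( )) normalizes the padded number that some wc implementations print.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -gt 44 ]
}

if wav_has_audio test.wav; then
  echo "test.wav contains audio data"
else
  echo "test.wav is missing or empty; check microphone permissions"
fi
```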

How It Works

Recording: Uses sox to record audio at 16 kHz, mono, 16-bit
Processing: Calls whisper-cli with the recorded audio file
Integration: Captures the output and inserts it into your Emacs buffer
Cleanup: Automatically cleans up temporary files and buffers
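
The four steps above can be approximated outside Emacs with a short shell function. This is a sketch under stated assumptions, not the code in whisper.el: `transcribe_clip` and the `SOX_BIN`, `WHISPER_BIN`, and `MODEL` variables are hypothetical names, and the clip is cut off after a fixed number of seconds, whereas the package stops on C-g.

```shell
# Defaults follow the paths this README assumes; override them as needed.
SOX_BIN="${SOX_BIN:-sox}"
WHISPER_BIN="${WHISPER_BIN:-$HOME/whisper.cpp/build/bin/whisper-cli}"
MODEL="${MODEL:-$HOME/whisper.cpp/models/ggml-base.en.bin}"

# Hypothetical helper: record SECONDS of audio, print the transcript, clean up.
# Usage: transcribe_clip [SECONDS]   (defaults to 5)
transcribe_clip() {
  seconds="${1:-5}"
  workdir="$(mktemp -d)" || return 1
  wav="$workdir/clip.wav"
  # Recording: 16 kHz, mono, 16-bit; the trim effect stops after $seconds.
  if ! "$SOX_BIN" -d -r 16000 -c 1 -b 16 "$wav" trim 0 "$seconds"; then
    rm -rf "$workdir"
    return 1
  fi
  # Processing: -nt omits timestamps and -np suppresses everything except
  # the result, so only the transcribed text reaches stdout.
  status=0
  "$WHISPER_BIN" -m "$MODEL" -f "$wav" -nt -np || status=$?
  # Cleanup: remove the temporary recording.
  rm -rf "$workdir"
  return "$status"
}
```

Calling `transcribe_clip 5` records five seconds and prints the text, which roughly matches what the package inserts into the buffer.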

License

This project is released under the MIT License.

Contributing

Feel free to submit issues and pull requests to improve this package.
