All notable changes to the DuckDB Whisper Extension will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Initial release of the DuckDB Whisper Extension
- Core transcription functions:
  - whisper_transcribe() - Transcribe audio to text (scalar function)
  - whisper_transcribe_segments() - Transcribe with timestamps and metadata (table function)
- Model management functions:
  - whisper_list_models() - List all available models with download status
  - whisper_download_model() - Get download instructions for a model
- Utility functions:
  - whisper_version() - Get extension and whisper.cpp version info
  - whisper_check_audio() - Validate audio files
  - whisper_audio_info() - Get audio file metadata
- Configuration functions (session-persistent):
  - whisper_set_model() - Set default model
  - whisper_set_model_path() - Set model storage directory
  - whisper_set_language() - Set default language
  - whisper_set_threads() - Set thread count
  - whisper_get_config() - View current configuration
- Recording functions (require SDL2; enabled by default):
  - whisper_list_devices() - List audio input devices
  - whisper_record() - Record from microphone and transcribe
  - whisper_record_translate() - Record and translate to English
  - whisper_record_auto() - Record with automatic silence detection (configurable)
- Support for all Whisper models (tiny through large-v3-turbo)
- Support for multiple audio formats via FFmpeg (WAV, MP3, FLAC, OGG, AAC, etc.)
- BLOB input support for transcribing audio from memory
- Automatic audio conversion to 16kHz mono (Whisper requirement)
- Model caching for improved performance on repeated transcriptions
- SQL test suite with 34 test assertions
- Comprehensive documentation
- Compile-time option WHISPER_ENABLE_RECORDING (ON by default)
- Built on whisper.cpp v1.8.3
- Compatible with DuckDB v1.4.x
- Uses FFmpeg for audio decoding
- Uses SDL2 for audio recording (optional)
- Supports macOS (ARM64, x86_64) and Linux