A voice-to-text CLI tool that records your voice, transcribes it with the OpenAI Whisper API, applies context-aware transformations using customizable personas, and copies the result to your clipboard.
- 🎙️ Voice Recording - Record audio directly from your terminal with multiple stop modes
- 🔊 Whisper Transcription - High-quality speech-to-text using OpenAI Whisper API
- 👤 Personas - Define custom writing styles and tones for different contexts (Slack, email, technical docs)
- ✨ Smart Cleanup - Automatic punctuation, capitalization, and filler word removal
- 📋 Clipboard Integration - Auto-copy to clipboard (Wayland & X11 support)
- 📚 History Management - Browse, search, and reuse past transcriptions
- 🎯 Custom Vocabulary - Add domain-specific terms for better transcription accuracy
- 🔧 Fully Configurable - Customize audio settings, transformation preferences, and more
# Required
nix-env -iA nixos.ffmpeg # Audio recording
nix-env -iA nixos.nodejs_22 # Node.js runtime
# For Wayland users
nix-env -iA nixos.wl-clipboard # Clipboard support
# For X11 users
nix-env -iA nixos.xclip # Clipboard support
# Ubuntu/Debian
sudo apt install ffmpeg nodejs npm
sudo apt install wl-clipboard # Wayland
sudo apt install xclip # X11
# Arch Linux
sudo pacman -S ffmpeg nodejs npm
sudo pacman -S wl-clipboard # Wayland
sudo pacman -S xclip # X11
# Clone the repository
git clone https://github.com/marktoda/scribe.git
cd scribe
# Install dependencies
pnpm install
# Build the project
pnpm build
# Link globally
npm link

Set up your OpenAI API key:
export OPENAI_API_KEY="sk-..."

Scribe creates a config file at ~/.config/scribe/config.json:
{
"openaiApiKey": "env:OPENAI_API_KEY",
"tone": {
"default": "light"
},
"audio": {
"sampleRate": 16000,
"mono": true,
"saveByDefault": false
},
"historyLimit": 500,
"clipboard": {
"preferredTool": "wl-clipboard" // or "xclip", "xsel"
},
"vocabulary": {
"enabled": true,
"words": ["API", "OAuth", "JWT"]
}
}

Scribe follows the XDG Base Directory specification:
- Configuration: ~/.config/scribe/
  - config.json - Main configuration file
- User Data: ~/.local/share/scribe/
  - personas.json - Custom personas
  - vocabulary.json - Custom vocabulary terms
  - history.jsonl - Transcription history
  - audio/ - Saved audio recordings
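The layout above can be reproduced in a few lines of shell, which is handy for locating the files from scripts. This is a minimal sketch: the helper names `scribe_config_dir` and `scribe_data_dir` are hypothetical, and it assumes Scribe honors `XDG_CONFIG_HOME` and `XDG_DATA_HOME` when they are set, which this README does not state explicitly.

```shell
# Sketch: resolve Scribe's directories per the XDG Base Directory spec.
# Assumption: Scribe falls back to ~/.config and ~/.local/share only when
# XDG_CONFIG_HOME / XDG_DATA_HOME are unset or empty.

# Hypothetical helper: prints the directory holding config.json.
scribe_config_dir() {
  printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/scribe"
}

# Hypothetical helper: prints the directory holding personas, vocabulary,
# history, and saved audio.
scribe_data_dir() {
  printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/scribe"
}

echo "config:  $(scribe_config_dir)/config.json"
echo "history: $(scribe_data_dir)/history.jsonl"
```

With neither variable set, the two paths printed match the defaults listed above.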
# Basic recording - press Enter to stop
scribe
# Use a persona for Slack messages
scribe -p slack
# Record with context for better transformation
scribe --context "Technical discussion about API design"
# Combine persona and context
scribe -p technical-docs --context "Documentation for auth module"
# Create a Slack persona with casual, technical tone
scribe persona add slack
# Configure: lowercase sentences, brevity, channel/user mentions
# Use it for all Slack messages
scribe -p slack
# Create a technical docs persona
scribe persona add tech-docs
# Configure: formal tone, proper capitalization, detailed explanations
# Record documentation
scribe -p tech-docs --context "API endpoint documentation"
# Record with context for better structure
scribe --context "Q4 planning meeting" --save-audio
# Review history later
scribe history --limit 10
# Create email persona
scribe persona add email
# Configure: professional tone, proper greetings, structured paragraphs
# Draft emails quickly
scribe -p email --context "Reply to client about project timeline"
# Record and transcribe (default)
scribe
# Use a specific persona
scribe -p slack # Use your Slack writing style
scribe --persona technical # Technical documentation style
# With custom context
scribe --context "Meeting notes about Q4 planning"
# Save audio file
scribe --save-audio
# Different language
scribe --lang es
# Different stop modes
scribe --stop silence # Stop after 3 seconds of silence
scribe --stop timeout # Stop after timeout
scribe --stop enter # Press Enter to stop (default)

Personas let you define reusable writing styles for different contexts:
# Create a new persona
scribe persona add slack
# Then interactively set:
# - Display name
# - Description
# - Context/instructions for the AI
# List all personas
scribe persona list
# Edit a persona (opens in $EDITOR)
scribe persona edit slack
# Show persona details
scribe persona show slack
# Remove a persona
scribe persona remove slack
# Example: Create a technical Slack persona
scribe persona add technical-slack
# Display name: Technical Slack
# Description: For technical discussions in Slack
# Context:
# Casual, concise, technical language
# Start sentences with lowercase letters
# Use bullet points for lists
# Include channel links as <#channel-name>
# Include user mentions as <@username>

Improve transcription accuracy with custom vocabulary:
# Add technical terms
scribe vocab add "Kubernetes" "PostgreSQL" "GraphQL"
# List all vocabulary
scribe vocab list
# Remove terms
scribe vocab remove "GraphQL"
# Import from file
scribe vocab import ~/technical-terms.txt
# Export vocabulary
scribe vocab export ~/my-vocabulary.txt
# View recent transcriptions
scribe history
# View more entries
scribe history --limit 100
# Copy a previous entry
scribe history --copy 8b2f3a
# Remove an entry
scribe history --rm 8b2f3a
# Export as JSON
scribe history --json > transcriptions.json
# Clear all history
scribe history --clear

| Option | Description | Default |
|---|---|---|
| `-p, --persona <id>` | Use a persona for text transformation | - |
| `--context <text>` | Add context to the transcription | - |
| `--context-file <path>` | Read context from file | - |
| `--model <model>` | STT model to use | whisper-1 |
| `--lang <code>` | Language code (e.g., en, es, fr) | en |
| `--stop <mode>` | Stop strategy: enter, timeout, silence | enter |
| `--silence-secs <n>` | Seconds of silence to stop | 3 |
| `--timeout <n>` | Recording timeout in seconds | 120 |
| `--save-audio` | Save raw audio file | false |
| `-q, --quiet` | Less console output | false |
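The options above can be combined in a single invocation. The sketch below uses a hypothetical helper, `scribe_cmd`, which is not part of Scribe: it only prints a command line assembled from the documented flags so you can review it before running it. The 5-second silence threshold is an arbitrary illustration, and whether `--silence-secs` has any effect outside `--stop silence` is not stated here.

```shell
# Hypothetical helper (not part of Scribe): prints a scribe command line
# built from documented options so it can be reviewed before running.
# Usage: scribe_cmd [persona] [context-file]
scribe_cmd() {
  sc_persona=${1:-}
  sc_context_file=${2:-}
  # Stop after 5 seconds of silence instead of the 3-second default.
  set -- scribe --stop silence --silence-secs 5
  if [ -n "$sc_persona" ]; then
    set -- "$@" --persona "$sc_persona"
  fi
  # Only pass --context-file when the file actually exists.
  if [ -n "$sc_context_file" ] && [ -f "$sc_context_file" ]; then
    set -- "$@" --context-file "$sc_context_file"
  fi
  printf '%s\n' "$*"
}

scribe_cmd slack
```

Printing instead of executing keeps the helper safe to experiment with. Note that the printed form does not preserve quoting, so paths containing spaces would need extra care before the output is run.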

| Option | Description | Default |
|---|---|---|
| `--limit <n>` | Number of entries to show | 50 |
| `--json` | Output as JSON | false |
| `--copy <id>` | Copy entry to clipboard | - |
| `--rm <id>` | Remove entry | - |
| `--clear` | Clear all history | false |

| Option | Description | Default |
|---|---|---|
| `-q, --quiet` | Less output | false |
| `-f, --force` | Skip confirmation prompts | false |
| `--json` | Output as JSON (list/show) | false |

Actions:
- add <id> - Create a new persona
- edit <id> - Edit an existing persona (opens in $EDITOR)
- list - List all personas
- show <id> - Show details of a persona
- remove <id> - Remove a persona
- clear - Clear all personas

| Option | Description | Default |
|---|---|---|
| `-q, --quiet` | Less output | false |

Actions:
- add <words...> - Add words to vocabulary
- remove <words...> - Remove words from vocabulary
- list - List all vocabulary words
- clear - Clear all vocabulary
- import <file> - Import words from file
- export <file> - Export words to file
# Run in development mode
pnpm dev
# Run tests
pnpm test
# Lint and format
pnpm fix
# Build for production
pnpm build

Scribe is built with a modular TypeScript architecture:
- AudioRecorder - FFmpeg integration for audio capture with configurable stop modes
- Transcriber - OpenAI Whisper API for speech-to-text conversion
- TextTransformer - GPT-based text cleanup with persona support
- PersonaStore - Manage custom writing styles and tones
- VocabularyStore - Custom vocabulary management for improved accuracy
- ClipboardService - Cross-platform clipboard support (Wayland/X11)
- HistoryStore - JSONL-based transcript storage with metadata
- ConfigLoader - User preferences and settings management
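To illustrate the Wayland/X11 split that ClipboardService has to handle, here is a minimal shell sketch that picks a clipboard command from the session type. The function name `clipboard_cmd` is hypothetical and Scribe's actual selection logic is not shown in this README; `wl-copy` (from the wl-clipboard package) and `xclip -selection clipboard` are the standard commands for each session type.

```shell
# Sketch: choose a clipboard command from the session type.
# Assumption: this only mirrors what a clipboard service might do;
# it is not Scribe's implementation.
# Usage: clipboard_cmd [session-type]   (defaults to $XDG_SESSION_TYPE)
clipboard_cmd() {
  case "${1:-${XDG_SESSION_TYPE:-}}" in
    wayland) echo "wl-copy" ;;
    x11)     echo "xclip -selection clipboard" ;;
    *)       echo "unsupported session type" >&2; return 1 ;;
  esac
}
```

With the matching tool installed, `printf 'hello' | $(clipboard_cmd)` copies text using whichever command fits the current session.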
Ensure PipeWire/PulseAudio is running:
# Check audio sources
pactl list sources
# Test recording
ffmpeg -f pulse -i default -t 5 test.wav

Check your session type and install appropriate tools:
# Check session type
echo $XDG_SESSION_TYPE
# Wayland
nix-env -iA nixos.wl-clipboard
# X11
nix-env -iA nixos.xclip

MIT - See LICENSE file for details