Scribe - Voice Transcription CLI

A voice-to-text CLI that records audio from your terminal, transcribes it with the OpenAI Whisper API, applies context-aware transformations using customizable personas, and copies the result to your clipboard.

Features

  • 🎙️ Voice Recording - Record audio directly from your terminal with multiple stop modes
  • 🔊 Whisper Transcription - High-quality speech-to-text using the OpenAI Whisper API
  • 👤 Personas - Define custom writing styles and tones for different contexts (Slack, email, technical docs)
  • ✨ Smart Cleanup - Automatic punctuation, capitalization, and filler word removal
  • 📋 Clipboard Integration - Auto-copy to clipboard (Wayland & X11 support)
  • 📚 History Management - Browse, search, and reuse past transcriptions
  • 🎯 Custom Vocabulary - Add domain-specific terms for better transcription accuracy
  • 🔧 Fully Configurable - Customize audio settings, transformation preferences, and more

Installation

Prerequisites

System Dependencies (NixOS)

# Required
nix-env -iA nixos.ffmpeg        # Audio recording
nix-env -iA nixos.nodejs_22     # Node.js runtime

# For Wayland users
nix-env -iA nixos.wl-clipboard  # Clipboard support

# For X11 users
nix-env -iA nixos.xclip         # Clipboard support

System Dependencies (Other Linux)

# Ubuntu/Debian
sudo apt install ffmpeg nodejs npm
sudo apt install wl-clipboard    # Wayland
sudo apt install xclip          # X11

# Arch Linux
sudo pacman -S ffmpeg nodejs npm
sudo pacman -S wl-clipboard     # Wayland
sudo pacman -S xclip           # X11

Install Scribe

# Clone the repository
git clone https://github.com/marktoda/scribe.git
cd scribe

# Install dependencies (requires pnpm; if it is missing, e.g.: npm install -g pnpm)
pnpm install

# Build the project
pnpm build

# Link globally
npm link

Configuration

API Keys

Set up your OpenAI API key:

export OPENAI_API_KEY="sk-..."

Config File

Scribe creates a config file at ~/.config/scribe/config.json:

{
  "openaiApiKey": "env:OPENAI_API_KEY",
  "tone": {
    "default": "light"
  },
  "audio": {
    "sampleRate": 16000,
    "mono": true,
    "saveByDefault": false
  },
  "historyLimit": 500,
  "clipboard": {
    "preferredTool": "wl-clipboard"
  },
  "vocabulary": {
    "enabled": true,
    "words": ["API", "OAuth", "JWT"]
  }
}

The clipboard "preferredTool" value can also be "xclip" or "xsel".
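The "env:OPENAI_API_KEY" value in the config suggests that the key is read from the environment rather than stored in the file. A minimal sketch of how such a reference could be resolved is shown here; the helper name resolveSecret and the exact rules (an "env:" prefix, an error when the variable is unset) are assumptions for illustration, not Scribe's actual loader.

```typescript
// Hypothetical helper: resolves config values of the form "env:NAME".
// Not taken from Scribe's source; shown only to illustrate the idea.
type Env = Record<string, string | undefined>;

function resolveSecret(value: string, env: Env): string {
  const prefix = "env:";
  if (value.indexOf(prefix) !== 0) {
    // No prefix: treat the value as a literal stored directly in the config.
    return value;
  }
  const name = value.slice(prefix.length);
  const resolved = env[name];
  if (resolved === undefined || resolved === "") {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return resolved;
}
```

In a Node.js program this would typically be called as resolveSecret(config.openaiApiKey, process.env).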

Data Storage Locations

Scribe follows the XDG Base Directory specification:

  • Configuration: ~/.config/scribe/
    • config.json - Main configuration file
  • User Data: ~/.local/share/scribe/
    • personas.json - Custom personas
    • vocabulary.json - Custom vocabulary terms
    • history.jsonl - Transcription history
    • audio/ - Saved audio recordings
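Under the XDG Base Directory specification, the locations above are the defaults used when XDG_CONFIG_HOME and XDG_DATA_HOME are unset. The sketch below derives the paths that way. Whether Scribe honors those variables is an assumption, and scribePaths is a hypothetical name.

```typescript
// Hypothetical sketch of XDG path resolution; not Scribe's actual code.
type Env = Record<string, string | undefined>;

interface ScribePaths {
  configFile: string;
  personas: string;
  vocabulary: string;
  history: string;
  audioDir: string;
}

// The XDG spec says unset, empty, or relative values must be ignored.
function absoluteOr(value: string | undefined, fallback: string): string {
  return value !== undefined && value.charAt(0) === "/" ? value : fallback;
}

function scribePaths(env: Env, home: string): ScribePaths {
  const configDir = `${absoluteOr(env.XDG_CONFIG_HOME, `${home}/.config`)}/scribe`;
  const dataDir = `${absoluteOr(env.XDG_DATA_HOME, `${home}/.local/share`)}/scribe`;
  return {
    configFile: `${configDir}/config.json`,
    personas: `${dataDir}/personas.json`,
    vocabulary: `${dataDir}/vocabulary.json`,
    history: `${dataDir}/history.jsonl`,
    audioDir: `${dataDir}/audio`,
  };
}
```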

Quick Start

# Basic recording - press Enter to stop
scribe

# Use a persona for Slack messages (create it first with: scribe persona add slack)
scribe -p slack

# Record with context for better transformation
scribe --context "Technical discussion about API design"

# Combine persona and context
scribe -p technical-docs --context "Documentation for auth module"

Example Use Cases

Slack Messages

# Create a Slack persona with casual, technical tone
scribe persona add slack
# Configure: lowercase sentences, brevity, channel/user mentions

# Use it for all Slack messages
scribe -p slack

Technical Documentation

# Create a technical docs persona
scribe persona add tech-docs
# Configure: formal tone, proper capitalization, detailed explanations

# Record documentation
scribe -p tech-docs --context "API endpoint documentation"

Meeting Notes

# Record with context for better structure
scribe --context "Q4 planning meeting" --save-audio

# Review history later
scribe history --limit 10

Email Drafts

# Create email persona
scribe persona add email
# Configure: professional tone, proper greetings, structured paragraphs

# Draft emails quickly
scribe -p email --context "Reply to client about project timeline"

Usage

Basic Recording

# Record and transcribe (default)
scribe

# Use a specific persona
scribe -p slack                  # Use your Slack writing style
scribe --persona technical       # Technical documentation style

# With custom context
scribe --context "Meeting notes about Q4 planning"

# Save audio file
scribe --save-audio

# Different language
scribe --lang es

# Different stop modes
scribe --stop silence            # Stop after 3 seconds of silence
scribe --stop timeout            # Stop after timeout
scribe --stop enter              # Press Enter to stop (default)

Personas - Custom Writing Styles

Personas let you define reusable writing styles for different contexts:

# Create a new persona
scribe persona add slack
# Then interactively set:
# - Display name
# - Description
# - Context/instructions for the AI

# List all personas
scribe persona list

# Edit a persona (opens in $EDITOR)
scribe persona edit slack

# Show persona details
scribe persona show slack

# Remove a persona
scribe persona remove slack

# Example: Create a technical Slack persona
scribe persona add technical-slack
# Display name: Technical Slack
# Description: For technical discussions in Slack
# Context:
#   Casual, concise, technical language
#   Start sentences with lowercase letters
#   Use bullet points for lists
#   Include channel links as <#channel-name>
#   Include user mentions as <@username>
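To make the persona fields concrete, here is a rough sketch of how a persona and a --context string might be combined into instructions for the text-transformation step. The Persona shape mirrors the three interactive prompts above. The function name buildInstructions and the wording of the base instruction are assumptions, not Scribe's real prompt.

```typescript
// Hypothetical persona record, mirroring the fields prompted by `scribe persona add`.
interface Persona {
  id: string;
  displayName: string;
  description: string;
  context: string; // free-form style instructions for the AI
}

// Combine the base cleanup instruction, the persona's style, and any ad hoc context.
function buildInstructions(persona: Persona | undefined, context?: string): string {
  const parts: string[] = [
    "Clean up the transcript: fix punctuation and capitalization, remove filler words.",
  ];
  if (persona !== undefined) {
    parts.push(`Writing style (${persona.displayName}):\n${persona.context.trim()}`);
  }
  if (context !== undefined && context.trim() !== "") {
    parts.push(`Context: ${context.trim()}`);
  }
  return parts.join("\n\n");
}
```

Keeping the persona text separate from the per-recording context means the same persona can be reused across recordings while the context changes each time.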

Vocabulary Management

Improve transcription accuracy with custom vocabulary:

# Add technical terms
scribe vocab add "Kubernetes" "PostgreSQL" "GraphQL"

# List all vocabulary
scribe vocab list

# Remove terms
scribe vocab remove "GraphQL"

# Import from file
scribe vocab import ~/technical-terms.txt

# Export vocabulary
scribe vocab export ~/my-vocabulary.txt
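One common way to use a vocabulary list with Whisper is to pass the terms through the transcription request's optional prompt parameter, which nudges the model toward those spellings. The sketch below assumes an import file with one term per line and case-insensitive de-duplication. Both function names are hypothetical, and Scribe may handle vocabulary differently.

```typescript
// Hypothetical parser for an import file with one term per line.
function parseVocabularyFile(text: string): string[] {
  const seen = new Set<string>();
  const words: string[] = [];
  for (const line of text.split(/\r?\n/)) {
    const term = line.trim();
    if (term === "") continue; // skip blank lines
    const key = term.toLowerCase();
    if (!seen.has(key)) {
      // Keep the first spelling encountered for each term.
      seen.add(key);
      words.push(term);
    }
  }
  return words;
}

// Join the terms into a short string suitable for a transcription `prompt` value.
function buildWhisperPrompt(words: string[]): string {
  return words.join(", ");
}
```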

History Management

# View recent transcriptions
scribe history

# View more entries
scribe history --limit 100

# Copy a previous entry
scribe history --copy 8b2f3a

# Remove an entry
scribe history --rm 8b2f3a

# Export as JSON
scribe history --json > transcriptions.json

# Clear all history
scribe history --clear
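History is stored as JSONL (one JSON object per line) and capped by the historyLimit setting. The sketch below shows one way such a file could be parsed, appended to, and trimmed. The HistoryEntry fields are assumptions, since the actual schema of history.jsonl is not documented here.

```typescript
// Hypothetical entry shape; the real history.jsonl schema may differ.
interface HistoryEntry {
  id: string;
  timestamp: string;
  text: string;
}

// Parse JSONL: one JSON document per non-empty line.
function parseHistory(jsonl: string): HistoryEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as HistoryEntry);
}

// Append an entry and keep only the newest `limit` entries.
function appendEntry(entries: HistoryEntry[], entry: HistoryEntry, limit: number): HistoryEntry[] {
  const next = entries.concat([entry]);
  return next.length > limit ? next.slice(next.length - limit) : next;
}

// Serialize back to JSONL, with a trailing newline when non-empty.
function serializeHistory(entries: HistoryEntry[]): string {
  return entries.map((e) => JSON.stringify(e)).join("\n") + (entries.length > 0 ? "\n" : "");
}
```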

Command Options

scribe (default command)

Option                 Description                             Default
-p, --persona <id>     Use a persona for text transformation   -
--context <text>       Add context to the transcription        -
--context-file <path>  Read context from file                  -
--model <model>        STT model to use                        whisper-1
--lang <code>          Language code (e.g., en, es, fr)        en
--stop <mode>          Stop strategy: enter, timeout, silence  enter
--silence-secs <n>     Seconds of silence to stop              3
--timeout <n>          Recording timeout in seconds            120
--save-audio           Save raw audio file                     false
-q, --quiet            Less console output                     false
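The three stop-related options interact: --stop selects the strategy, while --silence-secs and --timeout supply its thresholds. A small sketch of validating those flags against the documented defaults (enter, 3, 120) follows. The type and function names are hypothetical, and Scribe's own argument handling may differ.

```typescript
type StopMode = "enter" | "timeout" | "silence";

interface StopOptions {
  mode: StopMode;
  silenceSecs: number;
  timeoutSecs: number;
}

// Defaults taken from the options table above.
const STOP_DEFAULTS: StopOptions = { mode: "enter", silenceSecs: 3, timeoutSecs: 120 };

function isStopMode(value: string): value is StopMode {
  return ["enter", "timeout", "silence"].indexOf(value) !== -1;
}

function parsePositive(raw: string | undefined, fallback: number, flag: string): number {
  if (raw === undefined) return fallback;
  const n = Number(raw);
  if (!isFinite(n) || n <= 0) {
    throw new Error(`${flag} must be a positive number, got "${raw}"`);
  }
  return n;
}

// Hypothetical resolver: raw CLI strings in, validated options out.
function resolveStopOptions(flags: { stop?: string; silenceSecs?: string; timeout?: string }): StopOptions {
  const mode = flags.stop === undefined ? STOP_DEFAULTS.mode : flags.stop;
  if (!isStopMode(mode)) {
    throw new Error(`--stop must be one of enter, timeout, silence; got "${mode}"`);
  }
  return {
    mode,
    silenceSecs: parsePositive(flags.silenceSecs, STOP_DEFAULTS.silenceSecs, "--silence-secs"),
    timeoutSecs: parsePositive(flags.timeout, STOP_DEFAULTS.timeoutSecs, "--timeout"),
  };
}
```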

scribe history

Option       Description                Default
--limit <n>  Number of entries to show  50
--json       Output as JSON             false
--copy <id>  Copy entry to clipboard    -
--rm <id>    Remove entry               -
--clear      Clear all history          false

scribe persona

Option       Description                 Default
-q, --quiet  Less output                 false
-f, --force  Skip confirmation prompts   false
--json       Output as JSON (list/show)  false

Actions:

  • add <id> - Create a new persona
  • edit <id> - Edit an existing persona (opens in $EDITOR)
  • list - List all personas
  • show <id> - Show details of a persona
  • remove <id> - Remove a persona
  • clear - Clear all personas

scribe vocab

Option       Description  Default
-q, --quiet  Less output  false

Actions:

  • add <words...> - Add words to vocabulary
  • remove <words...> - Remove words from vocabulary
  • list - List all vocabulary words
  • clear - Clear all vocabulary
  • import <file> - Import words from file
  • export <file> - Export words to file

Development

# Run in development mode
pnpm dev

# Run tests
pnpm test

# Lint and format
pnpm fix

# Build for production
pnpm build

Architecture

Scribe is built with a modular TypeScript architecture:

  • AudioRecorder - FFmpeg integration for audio capture with configurable stop modes
  • Transcriber - OpenAI Whisper API for speech-to-text conversion
  • TextTransformer - GPT-based text cleanup with persona support
  • PersonaStore - Manage custom writing styles and tones
  • VocabularyStore - Custom vocabulary management for improved accuracy
  • ClipboardService - Cross-platform clipboard support (Wayland/X11)
  • HistoryStore - JSONL-based transcript storage with metadata
  • ConfigLoader - User preferences and settings management

Troubleshooting

No Microphone Access

Ensure PipeWire/PulseAudio is running:

# Check audio sources
pactl list sources

# Test recording
ffmpeg -f pulse -i default -t 5 test.wav
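For reference, the same test recording can be expressed as an argument list that matches the audio settings in config.json (16 kHz, mono). This is a sketch of how a recorder might assemble an FFmpeg command line. The function name buildFfmpegArgs is hypothetical, and Scribe's AudioRecorder may pass different flags.

```typescript
interface AudioConfig {
  sampleRate: number; // e.g. 16000
  mono: boolean;
}

// Build arguments for: ffmpeg -f pulse -i default [-ar N] [-ac N] [-t S] -y <output>
function buildFfmpegArgs(audio: AudioConfig, outputPath: string, maxSeconds?: number): string[] {
  const args = [
    "-f", "pulse",                    // capture through the PulseAudio input device
    "-i", "default",                  // default source
    "-ar", String(audio.sampleRate),  // output sample rate in Hz
    "-ac", audio.mono ? "1" : "2",    // output channel count
  ];
  if (maxSeconds !== undefined) {
    args.push("-t", String(maxSeconds)); // stop after this many seconds
  }
  args.push("-y", outputPath); // overwrite the output file without prompting
  return args;
}
```

In Node.js, the resulting array could be handed to child_process.spawn("ffmpeg", args).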

Clipboard Not Working

Check your session type and install appropriate tools:

# Check session type
echo $XDG_SESSION_TYPE

# Wayland
nix-env -iA nixos.wl-clipboard

# X11
nix-env -iA nixos.xclip
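The session type determines which clipboard tool applies: wl-copy (from wl-clipboard) on Wayland, xclip or xsel on X11. The sketch below picks a command from XDG_SESSION_TYPE, with the config's preferredTool taking precedence. The selection logic and the name pickClipboardCommand are assumptions and not necessarily how ClipboardService works.

```typescript
interface ClipboardCommand {
  command: string;
  args: string[];
}

// Hypothetical selection: explicit preference wins, otherwise infer from the session type.
function pickClipboardCommand(sessionType: string | undefined, preferredTool?: string): ClipboardCommand {
  const tool = preferredTool !== undefined
    ? preferredTool
    : sessionType === "wayland" ? "wl-clipboard" : "xclip";
  switch (tool) {
    case "wl-clipboard":
      return { command: "wl-copy", args: [] };
    case "xclip":
      return { command: "xclip", args: ["-selection", "clipboard"] };
    case "xsel":
      return { command: "xsel", args: ["--clipboard", "--input"] };
    default:
      throw new Error(`Unsupported clipboard tool: ${tool}`);
  }
}
```

Each of these commands reads the text to copy from standard input.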

License

MIT - See LICENSE file for details
