Skip to content

evanfuture/voice-worker

Repository files navigation

Voice Worker

A modern TypeScript-based file monitoring and processing system with web interface that watches folders, catalogs files, and runs configurable processors with intelligent dependency management.

✨ Features

  • πŸ“ File System Monitoring - Real-time folder watching with intelligent file cataloging
  • 🌐 Modern Web Interface - Nuxt 3 web app with real-time WebSocket updates
  • πŸ—„οΈ SQLite Database - Comprehensive file metadata and processing tracking
  • ⚑ Job Queue System - BullMQ-based queue with Redis for reliable job processing
  • πŸ”§ Configurable Processors - Database-driven processor configurations with UI management
  • πŸ”— Dependency Resolution - Smart processor dependencies and execution ordering
  • πŸ”„ Self-Healing - Automatically re-queues jobs when output files are deleted
  • πŸ’° Cost Management - Real-time cost calculation and batch approval workflows
  • 🎯 Batch Processing - User-controlled approval mode with selective execution
  • 🏷️ File Organization - Advanced tagging and metadata system
  • 🎨 Design System - Integrated design tokens with Figma synchronization

πŸš€ Quick Start

Prerequisites

  • Node.js β‰₯18
  • Redis server

Installation

# Install dependencies
npm install

# Start Redis (using Docker)
docker run -d -p 6379:6379 redis:alpine

# Setup environment
cp scripts/env.example .env
# Edit .env with your configuration (OpenAI API key, etc.)

# Initialize the system
npm run setup

# Start the full system
npm run dev

Access Points

πŸ—οΈ Modern Architecture

Core Components

  • 🌐 Nuxt 3 Web App - Modern Vue.js interface with server-side rendering
  • πŸ“Š Real-time Dashboard - WebSocket-powered queue monitoring and controls
  • βš™οΈ Processor Management - Database-driven configuration with web UI
  • πŸ“ File Watcher (chokidar) - Intelligent filesystem monitoring
  • πŸ—„οΈ Database Layer (better-sqlite3) - Comprehensive metadata storage
  • ⚑ Queue System (bullmq + Redis) - Reliable job processing
  • 🎨 Design System - Style Dictionary + Figma token synchronization

Processing Flow

Input File β†’ File Watcher β†’ Database Catalog β†’ Processor Selection β†’
Job Queue β†’ Processor Execution β†’ Output File
     ↑                                                    ↓
     └── Auto-healing re-queue if output deleted ←←←←←←←←←←

Web Interface Pages

  • πŸ“Š Dashboard (/) - Queue status, job monitoring, cost controls
  • πŸ“ Files & Tags (/files) - File management, tagging, metadata
  • βš™οΈ Processor Config (/processors) - Processor configuration management
  • βœ… Approval Queue (/approval) - Batch approval and cost visualization

πŸ”§ Unified Command System

All operations use the centralized script runner:

# System Management
npm run dev              # Start full development system
npm run build           # Build all components
npm run start           # Production mode

# Maintenance & Setup
npm run setup           # Initial system setup
npm run test-services   # Validate service connections
npm run clean          # Clean temporary files

# Design System
npm run tokens:build    # Build design tokens
npm run tokens:extract  # Extract tokens from components
npm run tokens:sync    # Sync tokens with Figma

# Legacy CLI (still available)
npm run cli status     # Queue status via CLI
npm run cli pause      # Pause processing
npm run cli resume     # Resume processing

🎯 Creating Processors

Processors are TypeScript files in src/processors/ that export a processor object:

// src/processors/my-processor.ts
import type { Processor } from "../types.js";

export const processor: Processor = {
  name: "my-processor",
  input: [".txt", ".md"], // File extensions to process
  outputExt: ".processed.txt", // Output file extension
  dependsOn: ["other-processor"], // Dependencies (optional)

  async run(inputPath: string): Promise<string> {
    // Your processing logic here
    const outputPath = inputPath + ".processed.txt";
    // ... do work ...
    return outputPath;
  },
};

Built-in Processors

transcribe

  • Input: Audio/video files (.m4a, .wav, .mp3, .mp4, .mov)
  • Output: .transcript.txt
  • Technology: OpenAI Whisper-1 API with intelligent chunking for large files
  • Features: Multi-stream audio handling, automatic cleanup, cost calculation

summarize

  • Input: .transcript.txt
  • Output: .summary.txt
  • Dependencies: transcribe
  • Technology: OpenAI GPT-4 Turbo with customizable prompts
  • Features: Professional summarization, cost tracking, configurable templates

Processor Configuration

The system uses a hybrid approach:

  1. Processor Implementations - TypeScript files containing the actual processing logic
  2. Database Configurations - Control when and how processors run
  3. Web Interface - Manage configurations through the browser

Create configurations via the web interface at /processors or programmatically:

// Database-driven processor configuration
const config = {
  name: "auto-summarize-transcripts",
  processorImplementation: "summarize",
  displayName: "Auto Summarize Transcripts",
  inputExtensions: [".transcript.txt"],
  inputTags: [], // Optional tag requirements
  outputExt: ".summary.txt",
  dependsOn: ["transcribe"],
  allowUserSelection: false, // Auto-run vs manual selection
  allowDerivedFiles: true, // Can process generated files
  config: { promptPath: "prompts/summarize.md" },
};

πŸ’° Cost Management & Batch Processing

Queue Modes

  • Auto Mode (default) - Automatic processing when files are detected
  • Approval Mode - User-controlled batch approval with cost preview

Cost Features

  • Real-time Calculation - Accurate cost estimation for OpenAI services
  • Batch Visualization - Preview total costs before processing
  • Historical Tracking - Complete cost history and analysis
  • Budget Controls - Pause queue when high-cost jobs are waiting

Batch Workflow

  1. Switch to approval mode via web interface
  2. Drop files into watched folder
  3. System catalogs files and predicts processing chains
  4. Review predicted costs and select desired processing steps
  5. Create and execute approval batches
  6. Monitor progress with real-time updates

🏷️ Advanced File Organization

File Tagging System

  • Automatic Tagging - Files auto-tagged based on patterns (.transcript. β†’ transcript tag)
  • Manual Tagging - Add custom tags via web interface
  • Tag-based Processing - Processors can require specific tags
  • Metadata Storage - Rich metadata with typed values

Processing Examples

// Extension-based processing
".mp3" files β†’ transcribe processor

// Tag-based processing
files with "meeting" tag β†’ specialized meeting processor

// Hybrid approach
".txt" files with "transcript" tag β†’ summarize processor

🎨 Design System Integration

Token-based Styling

The system includes a complete design system with Figma synchronization:

<template>
  <div class="status-card">
    <h3>System Status</h3>
    <span class="status-badge">Connected</span>
  </div>
</template>

<style scoped>
.status-card {
  background: var(--color-background-card);
  padding: var(--size-spacing-lg);
  border-radius: var(--size-border-radius-lg);
  box-shadow: var(--shadow-sm);
}

.status-badge {
  background: var(--color-status-success);
  color: white;
  padding: var(--size-spacing-xs) var(--size-spacing-sm);
  border-radius: var(--size-border-radius-pill);
}
</style>

Design Token Workflow

  1. Update tokens in src/design-system/tokens.json
  2. Run npm run tokens:build to generate CSS variables
  3. Sync with Figma using npm run tokens:sync
  4. Perfect design-code consistency

πŸ—„οΈ Database Schema

Core Tables

  • files - File metadata with path, hash, and classification
  • processor_configs - Database-driven processor configurations
  • file_tags - Flexible tagging system with optional values
  • file_metadata - Typed metadata storage
  • approval_batches - User approval sessions with cost tracking
  • predicted_jobs - Processing chain predictions with cost estimates

Self-Healing Features

  • Deletion Recovery - Automatically re-queues jobs when outputs are deleted
  • Dependency Tracking - Smart re-processing when intermediate files change
  • Orphaned Job Cleanup - Removes stale jobs on system restart

πŸš€ Production Deployment

Redis Setup

# Production Redis with persistence
docker run -d \
  --name voice-worker-redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:alpine redis-server --appendonly yes

Environment Configuration

# Copy and edit environment file
cp scripts/env.example .env

# Required environment variables
OPENAI_API_KEY=your_openai_api_key
REDIS_HOST=127.0.0.1
REDIS_PORT=6379

Process Management

# Build for production
npm run build

# Start with PM2
npm install -g pm2
pm2 start npm --name "voice-worker" -- start
pm2 startup
pm2 save

πŸ”§ Development

Development Setup

# Development with hot reload
npm run dev

# Run individual components
cd src/nuxt-web && npm run dev  # Web interface only
npm run cli status              # CLI tools
npm run test-services          # Test service connections

Adding Features

  1. New Processors - Add TypeScript files to src/processors/
  2. Web Interface - Edit Vue components in src/nuxt-web/
  3. API Endpoints - Add Nitro endpoints in src/nuxt-web/server/api/
  4. Database Changes - Extend schema in src/db/client.ts

πŸ“Š Monitoring & Debugging

Web Interface Monitoring

  • Real-time job queue status via WebSocket
  • Interactive cost summaries and trends
  • Failed job management with retry controls
  • File organization and tagging interface

CLI Monitoring

npm run cli status     # Queue statistics
npm run cli jobs       # List active jobs
npm run cli failures   # View failed jobs
npm run test-services  # Health check all services

Logs & Debugging

  • Console Logs - Structured logging with processing details
  • WebSocket Updates - Real-time status broadcasts
  • Error Tracking - Comprehensive error capture and display
  • Cost Analytics - Detailed cost breakdowns and trends

πŸ”„ Upgrade Notes

This system represents a consolidation of previous architectures:

  • Eliminated - Redundant Express server (now Nuxt-only)
  • Unified - All scripts under centralized npm run commands
  • Modernized - Parser β†’ Processor terminology consistency
  • Enhanced - Database-driven configuration vs hardcoded settings
  • Improved - Modern web interface vs CLI-only management

πŸ“š Documentation

  • Setup Guide - src/design-system/SETUP.md
  • Figma Integration - src/design-system/FIGMA_INTEGRATION.md
  • API Reference - Auto-generated from Nuxt server routes
  • Architecture Notes - AGENT_MEMORY.md (development documentation)

πŸ†˜ Troubleshooting

Common Issues

Redis Connection Failed

# Check Redis is running
docker ps | grep redis
# Restart if needed
docker restart voice-worker-redis

OpenAI API Errors

# Verify API key in .env
cat .env | grep OPENAI_API_KEY
# Test connection
npm run test-services

Web Interface Not Loading

# Check Nuxt development server
cd src/nuxt-web && npm run dev
# Check port 3000 is available

Service Health Check

# Comprehensive service validation
npm run test-services

# Individual component checks
npm run cli status      # Queue system
curl localhost:3000     # Web interface

🀝 Contributing

  1. Fork the repository
  2. Create feature branch
  3. Add tests for new functionality
  4. Update documentation
  5. Submit pull request

πŸ“„ License

MIT


Modern TypeScript β€’ Real-time Web Interface β€’ Intelligent Processing β€’ Cost-Aware Automation

About

auto-transcription agent system

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published