An AI-powered video editing agent that transcribes videos, matches the transcription against a script, and cuts the footage into clean segments. The agent uses OpenAI's Whisper for transcription and GPT-4 for editing decisions, with a Streamlit interface for uploading and processing videos.
This application requires FFmpeg to be installed on your system for video processing capabilities. MoviePy depends on FFmpeg for video and audio manipulation.
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Windows:
- Download FFmpeg from https://ffmpeg.org/download.html
- Extract the files and add the bin folder to your system PATH
- Alternatively, use Chocolatey:
choco install ffmpeg
Verification:
ffmpeg -version
Note: The application will not function properly without FFmpeg installed. Ensure FFmpeg is accessible from your command line before running the video editing agent.
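The same check can be done from Python before any processing starts. The following is a minimal sketch using only the standard library; the helper names find_ffmpeg and ffmpeg_version_line are illustrative and are not part of this project.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line() -> Optional[str]:
    """Return the first line printed by `ffmpeg -version`, or None if ffmpeg is unavailable."""
    path = find_ffmpeg()
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    line = ffmpeg_version_line()
    print(line or "FFmpeg not found: install it and make sure it is on your PATH.")
```

Running a check like this at startup lets the app show a clear message instead of failing partway through a video.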
This AI agent is designed for content creators, educators, and video professionals who need to:
- Automatically transcribe video content with precise timestamps
- Remove stutters and repetitions from video recordings
- Extract best takes when multiple attempts of the same line exist
- Create clean edits based on provided scripts
- Generate downloadable segments for further editing
- Streamline post-production workflows for talking-head videos
Perfect for:
- Content creators recording tutorials or presentations
- Educators creating course materials
- Podcasters and video bloggers
- Anyone who needs to clean up recorded video content
- Teams looking to automate video editing workflows
- Video-to-Audio Conversion: Automatically extracts audio from uploaded video files
- AI Transcription: Uses OpenAI Whisper for accurate speech-to-text with word-level timestamps
- Intelligent Editing: LLM-powered analysis to identify the best takes and remove repetitions
- Script Matching: Compares transcription against provided scripts for targeted editing
- Segment Export: Creates individual video segments for each selected clip
- Batch Processing: Processes multiple segments and packages them for download
- Web Interface: Clean Streamlit UI for easy file upload and processing
- Progress Tracking: Real-time progress updates during processing
The video editing agent follows this workflow:
- File Upload: Users upload video files through the Streamlit interface
- Audio Extraction: convert_video_to_audio uses MoviePy to extract audio tracks
- Transcription: transcribe_audio leverages OpenAI's Whisper API for accurate transcription with timestamps
- LLM Processing: process_transcription_with_llm uses GPT-4 to analyze transcripts and identify best segments
- Video Cutting: cut_video_segments creates individual video clips based on LLM recommendations
- File Packaging: zip_and_download_files bundles processed segments for download
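The data flow between these stages can be sketched as a single function. This README does not show the real signatures of the project's functions, so the sketch takes each stage as a callable; the argument and return shapes below are assumptions made for illustration only.

```python
from typing import Callable, List


def run_pipeline(
    video_path: str,
    script: str,
    convert: Callable[[str], str],
    transcribe: Callable[[str], dict],
    select: Callable[[dict, str], List[dict]],
    cut: Callable[[str, List[dict]], List[str]],
    package: Callable[[List[str]], str],
) -> str:
    """Chain the five stages and return whatever the packaging stage produces."""
    audio_path = convert(video_path)         # video file -> audio file path
    transcription = transcribe(audio_path)   # audio -> timestamped transcript
    edits = select(transcription, script)    # transcript + script -> chosen segments
    segment_paths = cut(video_path, edits)   # chosen segments -> video clip paths
    return package(segment_paths)            # clips -> downloadable archive path
```

Passing the stages in as parameters keeps the sketch runnable with stand-in functions, which is also a convenient way to test the flow without calling the OpenAI API.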
- OpenAI Whisper API: For speech-to-text transcription
- OpenAI GPT-4: For intelligent content analysis and editing decisions
- MoviePy: For video/audio processing and manipulation
- Python 3.8+
- OpenAI API key with access to Whisper and GPT-4
- Sufficient storage space for temporary video processing
- FFmpeg, installed and available on your PATH (MoviePy depends on it for video and audio processing)
Create a .env file in the project root:
OPENAI_API_KEY=your-openai-api-key-here
- Visit OpenAI API Keys
- Create a new secret key
- Copy the key to your .env file
- Ensure you have credits and access to both Whisper and GPT-4 models
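A quick way to confirm the key is picked up is to read it at startup and fail early if it is missing. The sketch below uses only the standard library and a deliberately simple KEY=VALUE parser; the project itself may load the file differently (for example with python-dotenv), and the function names here are hypothetical.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_api_key(path: str = ".env") -> str:
    """Prefer the process environment, fall back to the .env file, fail loudly otherwise."""
    key = os.environ.get("OPENAI_API_KEY") or read_env_file(path).get("OPENAI_API_KEY", "")
    if not key or key == "your-openai-api-key-here":
        raise RuntimeError("OPENAI_API_KEY is missing or still set to the placeholder value.")
    return key
```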
- Clone the repository:
git clone <repository-url>
cd video-editing-agent
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
cp .env.example .env
# Edit .env with your OpenAI API key
- Run the Streamlit application:
streamlit run main.py
- Open your browser to http://localhost:8501
video-editing-agent/
├── main.py # Main Streamlit application
├── requirements.txt # Python dependencies
├── .env # Environment variables
├── .env.example # Environment template
├── lib/
│   ├── convert.py # Video-to-audio conversion
│   ├── transcribe.py # Audio transcription with Whisper
│   ├── llm.py # LLM processing for editing decisions
│   ├── cut_video.py # Video segment cutting
│   └── download.py # File packaging and download
├── temp/ # Temporary processing files
├── exports/ # Processed video segments
└── .streamlit/
    └── config.toml # Streamlit configuration
- Start the application:
streamlit run main.py
- Upload a video file:
  - Supported formats: MP4, AVI, MOV, MKV, WMV, FLV, WebM
  - Maximum file size: 10GB (configurable in .streamlit/config.toml)
- Enter your script (optional):
  - Provide the intended script or talking points
  - The AI will match the transcription against this script
- Click "Run" to start processing:
  - Watch real-time progress updates
  - View processing metrics and results
  - Download the final edited segments
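Checking the file extension before processing avoids spending time on an upload that cannot be handled. This is a small sketch of such a check based on the supported formats listed above; the constant and function names are illustrative and not taken from the project.

```python
from pathlib import Path

# Extensions taken from the supported-formats list in this README.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".wmv", ".flv", ".webm"}


def is_supported_video(filename: str) -> bool:
    """Return True when the file extension (case-insensitive) is in the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

An extension check is only a first filter: a file with a supported extension can still lack an audio track or be corrupt.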
The interface shows progress through these stages:
- File Upload: Saving uploaded video to temporary storage
- Audio Conversion: Extracting audio track using convert_video_to_audio
- Transcription: Processing audio with transcribe_audio
- LLM Analysis: Analyzing content with process_transcription_with_llm
- Video Cutting: Creating segments with cut_video_segments
- File Preparation: Packaging downloads with zip_and_download_files
Modify .streamlit/config.toml to adjust:
- Maximum upload file size
- Server port and browser settings
- Theme and UI customization
- Logging levels
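As an illustration of what such a file can contain, the sketch below renders a small config.toml from Python. The keys server.maxUploadSize (in megabytes), server.port, and logger.level are standard Streamlit options; the values and the helper name render_streamlit_config are assumptions, and the project's actual config file may differ.

```python
def render_streamlit_config(
    max_upload_mb: int = 10240, port: int = 8501, log_level: str = "info"
) -> str:
    """Return config.toml text; maxUploadSize is expressed in megabytes (10240 MB = 10 GB)."""
    return (
        "[server]\n"
        f"maxUploadSize = {max_upload_mb}\n"
        f"port = {port}\n"
        "\n"
        "[logger]\n"
        f'level = "{log_level}"\n'
    )


if __name__ == "__main__":
    # Print the text that would go into .streamlit/config.toml.
    print(render_streamlit_config())
```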
The agent includes intelligent buffering and quality settings:
- Audio Compression: 64 kbps bitrate at a 22,050 Hz sample rate (MoviePy calls the audio sample rate fps) for efficient processing
- Video Codec: H.264 with AAC audio for compatibility
- Segment Buffering: Automatic 1-2 second padding for clean cuts
- Export Quality: Maintains original video quality in segments
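Segment buffering amounts to widening each cut by a small margin while staying inside the video. The sketch below shows one way to do that; the function name and the 1-second default are assumptions, not the project's actual code.

```python
from typing import Tuple


def pad_segment(
    start: float, end: float, duration: float, padding: float = 1.0
) -> Tuple[float, float]:
    """Widen [start, end] by `padding` seconds on each side, clamped to [0, duration]."""
    if end <= start:
        raise ValueError("end must be greater than start")
    padded_start = max(0.0, start - padding)
    padded_end = min(duration, end + padding)
    return padded_start, padded_end
```

For example, with a 10-second video, a segment from 0.5 s to 4.0 s becomes 0.0 s to 5.0 s: the start is clamped at the beginning of the video instead of going negative.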
The VideoEdit model defines segments with:
- Start/End timestamps: Precise timing for video cuts
- Script matching: Links segments to intended content
- Repetition removal: Automatically selects best takes
- Quality optimization: Prefers later attempts over earlier ones
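To make the "prefer later attempts" rule concrete, here is a deterministic stand-in for the selection step. In the project the choice is made by the LLM; the field names and the keep_latest_takes helper below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VideoEdit:
    """Hypothetical shape of one selected segment; field names are assumptions."""
    start: float       # segment start, in seconds
    end: float         # segment end, in seconds
    script_line: int   # index of the script line this take matches


def keep_latest_takes(takes: List[VideoEdit]) -> List[VideoEdit]:
    """For each script line keep the take that starts latest, ordered by script line."""
    latest: Dict[int, VideoEdit] = {}
    for take in takes:
        current = latest.get(take.script_line)
        if current is None or take.start > current.start:
            latest[take.script_line] = take
    return [latest[line] for line in sorted(latest)]
```

Given two takes of script line 0 (at 0 s and at 4 s) and one take of line 1, the function keeps the 4 s take for line 0 and the single take for line 1.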
- Temporary Storage: All processing files are stored in the temp directory
- Export Organization: Final segments are saved to exports with descriptive names
- Automatic Cleanup: Temporary files managed automatically
- Download Packaging: ZIP archives for easy file transfer
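The packaging step can be sketched with the standard zipfile module. This is an illustration rather than the project's zip_and_download_files implementation; the function name zip_segments and the choice to store files uncompressed (video is already compressed) are assumptions.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def zip_segments(segment_paths: Iterable[str], archive_path: str) -> str:
    """Write each segment into a ZIP archive under its base name and return the archive path."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for segment in segment_paths:
            # arcname drops the directory part so the archive has a flat layout.
            archive.write(segment, arcname=Path(segment).name)
    return archive_path
```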
- Large File Uploads: Adjust maxUploadSize in .streamlit/config.toml
- API Rate Limits: Monitor OpenAI usage and implement delays if needed
- Audio Extraction Errors: Ensure video files have valid audio tracks
- Memory Issues: Process shorter videos or reduce audio quality settings
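For the rate-limit case, one common way to "implement delays" is to retry with exponential backoff. The helper below is a generic sketch that is not part of this project; call_with_backoff and its parameters are illustrative names.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to `retries` times with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Assuming the project uses the openai Python package (version 1 or later), retry_on could be narrowed to (openai.RateLimitError,) so that other errors are raised immediately instead of being retried.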
This project is only a prototype and may not cover all edge cases. User feedback is welcome for improvements.
The application provides detailed logging:
- File size and format validation
- Audio extraction confirmation
- Transcription file creation
- LLM processing results
- Segment cutting progress
- Fork the repository
- Create a feature branch
- Make your changes
- Test with various video formats
- Submit a pull request
For issues and questions:
- Check the console output for detailed error messages
- Verify your OpenAI API key has sufficient credits
- Ensure video files are in supported formats
- Review temporary file permissions
Created by Tom Shaw - https://github.com/IAmTomShaw
This project demonstrates the power of combining modern AI APIs with practical video editing workflows, making professional-quality video editing accessible through a simple web interface.