Tired of manually cutting out silent pauses in your videos? VideoSpeeder is here to revolutionize your workflow! β‘οΈ
This tool intelligently analyzes your video, detects silent segments, and automatically speeds them up, saving you precious editing time. Perfect for:
- ποΈ Podcasters & YouTubers: Condense long recordings quickly.
- π¨βπ« Educators & Presenters: Make lectures more engaging.
- π» Coders & Tutorial Makers: Skip the thinking pauses in screen recordings.
- ...anyone who wants faster, tighter videos!
π₯ Have a look at a video that uses it here.
- π£οΈ Speech Detection (VAD): Uses Silero VAD (on by default) to detect actual human speech β ignores wind, keyboard, mouse, and ambient noise.
- π€« Fallback Silence Detection:
--no-vadmode usesffmpeg'ssilencedetectfilter for amplitude-based detection (useful for clean audio). - β© Dynamic Speed-Up: Automatically calculates the optimal speed for silent parts, aiming for a concise ~4-second duration while capping at a blazing 1000x! Shorter silences get a fixed 4x boost.
- π Rich CLI Stats: Get beautifully formatted video stats upfront using
rich. - ποΈ Visual Speed Indicator: Optional
>> [Speed]xoverlay shows exactly when the video is sped up. - π‘ Transcription Power: Includes a separate script (
transcribe.py) leveraging OpenAI Whisper for highly accurate VTT subtitle generation. Choose your model size! - βοΈ Fine-Tuned Control: Adjust silence detection
threshold(dB) and minimumduration(seconds). - β±οΈ Segment Processing: Process only specific parts of your video using
--offsetand--process-duration. - π GPU Acceleration: Supports NVIDIA NVENC for encoding and CUVID/NVDEC for decoding (requires compatible hardware, drivers, and FFmpeg build) for significantly faster processing.
- β³ Progress Bar: Keep track of the process with a
tqdmprogress bar.
Just is a command runner. Install it, then:
# 1. Set up the Python venv (installs torch, silero-vad, etc.)
just venv
# 2. Speed up a single video (VAD + GPU on by default)
just speed input=my_video.mp4 out_vad=my_video_fast.mp4
# 3. Multi-angle: detect on one file, speed up all files in a folder
just speed-all folder=recordings/ master=recordings/facecam.mp4 folder_output=recordings/output/
# 4. Browse output files in your browser (requires Docker)
just browse browse_dir=recordings/output/Run just with no args to see all available recipes.
- Python 3.x
- FFmpeg & FFprobe: Must be installed and accessible in your system's PATH.
- Python Packages: Listed in
videospeeder_project/requirements.txt. Install them via pip.(Note:pip install -r videospeeder_project/requirements.txt
--vadrequires installing PyTorch + Silero VAD; this can add significant install size.) - (Optional) NVIDIA GPU & Drivers: For GPU acceleration features.
- (Optional) CUDA Toolkit: Often needed for Whisper's GPU support during transcription.
- Clone the repository:
git clone https://github.com/jakkaj/videospeeder.git cd videospeeder - Install Python dependencies:
(Note: The
# Navigate into the project directory if you aren't already cd videospeeder_project pip install -r requirements.txt # Or use the Makefile shortcut from the project root # make install # (Run this from within videospeeder_project/)
make installcommand in the Makefile assumes you are inside thevideospeeder_projectdirectory).
VideoSpeeder consists of two main scripts located in the videospeeder_project directory.
Run this script from within the videospeeder_project directory.
python videospeeder.py --input <your_video.mp4> --output <output_video.mp4> [OPTIONS]Common Options:
-i, --input: Path to your input video file (Required).-o, --output: Path for the processed output video file (Required).-t, --threshold: Silence threshold in dB (Default: -30.0). Lower values detect quieter sounds as silence.-d, --duration: Minimum duration of silence in seconds to be sped up (Default: 2.0).--no-vad: Disable VAD and fall back to FFmpeg silencedetect (amplitude-based).--vad-threshold: Speech probability threshold in[0.0, 1.0](Default:0.75). Higher rejects more noise.--indicator: Show the>> [Speed]xoverlay during sped-up parts.--gpu: Enable NVIDIA NVENC GPU encoding.--gpu-decode: Enable NVIDIA CUVID/NVDEC GPU decoding (Experimental).--offset: Start processing from this time (in seconds).--process-duration: Process only this duration (in seconds) from the start (or offset).--detect: Detect speech/silence and write a.vad.jsonsidecar file next to the input video, then exit. No output video is produced.--vad-json: Load silence intervals from a previously generated.vad.jsonsidecar instead of running live detection.--quiet: Suppress informational output (progress bars and summary still shown).--folder: Process all video files in a directory using a shared.vad.jsonsidecar.--vad-master: Master video file for VAD detection in folder mode. Detects speech on this file, writes a sidecar, then processes all videos in--folder.--overwrite: Re-process videos even if output files already exist (default: skip existing).--extensions: Comma-separated video file extensions for folder mode (Default:mp4,mkv,mov,avi,webm).--parallel N: Process N videos simultaneously in folder mode (Default: 1). Best with--gpu; each video uses one NVENC session.
Example:
# Process a video (VAD speech detection is on by default)
python videospeeder.py -i my_recording.mp4 -o my_recording_fast.mp4
# With GPU encoding and speed indicator overlay
python videospeeder.py -i my_recording.mp4 -o my_recording_fast.mp4 --gpu --indicator
# Tune VAD sensitivity for very noisy audio
python videospeeder.py -i my_recording.mp4 -o my_recording_fast.mp4 --vad-threshold 0.80
# Fall back to amplitude-based detection (no VAD)
python videospeeder.py -i my_recording.mp4 -o my_recording_fast.mp4 --no-vad -t -40If you record with multiple camera angles (e.g. facecam, screen capture, overhead), VideoSpeeder can process them all with identical timing from a single speech-detection pass.
Two-step workflow (detect once, then process all):
# Step 1: Detect speech on the master audio source (writes facecam.vad.json)
python videospeeder.py -i recordings/facecam.mp4 --detect
# Step 2: Process all videos in the folder using the shared sidecar
python videospeeder.py --folder recordings/ -o recordings/output/ --gpuOne-liner workflow (detect + process in a single command):
python videospeeder.py --folder recordings/ --vad-master facecam.mp4 -o recordings/output/ --gpuParallel processing (process 2 videos at once for ~1.8x faster throughput):
python videospeeder.py --folder recordings/ -o recordings/output/ --gpu --parallel 2Both workflows produce output files with identical segment timing, so they stay in perfect sync when imported into your NLE.
Folder mode features:
- Auto-discovers the
.vad.jsonsidecar in the folder (errors if zero or multiple found) - Skips already-processed outputs by default; use
--overwriteto re-process - Continues processing if a single file fails, with a summary at the end
- Handles duration mismatches between angles (truncates silence intervals per video)
--parallel Nprocesses N videos simultaneously;--parallel 2is the sweet spot for GPU encoding. Consumer GPUs support 8-12 concurrent NVENC sessions
Offline note for --vad:
- Silero VAD may download/cache model assets on first use. If you're offline, run once while online or preload
the model in your environment before using
--vadwithout network.
Run this script from within the videospeeder_project directory.
python transcribe.py --input <your_video_or_audio.mp4> --output <subtitles.vtt> [OPTIONS]Common Options:
-i, --input: Path to your input video or audio file (Required).-o, --output: Path for the output VTT subtitle file (Required).-m, --model: Whisper model size (Choices:tiny,base,small,medium,large. Default:large). Larger models are more accurate but require more resources.
Example:
# Transcribe a video using the 'medium' Whisper model
python transcribe.py -i my_recording_fast.mp4 -o my_recording_subs.vtt -m mediumThe Makefile inside videospeeder_project provides shortcuts for common tasks (run from within videospeeder_project directory):
make install: Install dependencies.make test: Run a test using a sample file path (edit the path in the Makefile first!) with GPU and indicator.make test-segment: Run a test on a segment.make transcribe INPUT=video.mp4 OUTPUT=subs.vtt [MODEL=large]: Transcribe a file.make transcript-segment: Transcribe the output ofmake test-segment.make clean: Remove generated.mp4files from the directory.make help: Show available make commands.
This project is licensed under the MIT License. See the LICENSE file for details.
Found a bug or have an idea? Feel free to open an issue on the GitHub repository!