Intelligent GPU-accelerated frame extraction for presentation and lecture videos
Smart Extract is a high-performance Python tool that automatically finds and exports the best possible still frames from long presentation recordings such as talks, lectures, demos, and keynotes.
Instead of dumping thousands of near-identical frames, Smart Extract scans the entire video, scores visual quality, enforces temporal diversity, and optionally applies AI super-resolution to produce a curated, portfolio-ready image set.
Most frame-grab tools blindly sample frames by time.
Smart Extract understands quality.
It evaluates every candidate frame and keeps only the best moments.
- Global Best-Frame Selection: scores frames by sharpness + brightness across the entire video
- Temporal Diversity Enforcement: prevents near-duplicate frames from the same moment
- GPU-Accelerated Scoring (Optional): uses PyTorch and CUDA for fast analysis on RTX GPUs
- AI Super-Resolution (4x): optional RealESRGAN upscaling for ultra-clean slides and thumbnails
- Auto-Crop for Presentation Screens: removes dead space and zooms into the slide content
- Optional Color Normalization: corrects projector color casts such as purple or yellow shifts
- Fast Video Decoding: uses Decord when available with OpenCV fallback
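The "Decord when available, OpenCV otherwise" behavior can be illustrated with a small backend check. This is a minimal sketch, not the tool's actual code: the function name `pick_decoder` and its `preferred` parameter are hypothetical, and it only reports which module is importable rather than opening a video.

```python
from importlib.util import find_spec


def pick_decoder(preferred=("decord", "cv2")):
    """Return the first importable module name from `preferred`, or None.

    `pick_decoder` is a hypothetical helper for illustration; the order of
    `preferred` encodes "try Decord first, then fall back to OpenCV".
    """
    for module_name in preferred:
        # find_spec returns None when a top-level module is not installed.
        if find_spec(module_name) is not None:
            return module_name
    return None


backend = pick_decoder()
print(f"Video decoding backend: {backend or 'none installed'}")
```

Checking with `find_spec` avoids importing a heavy library just to learn whether it exists; the actual import can then happen once, for the chosen backend only.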
- Scan the entire video at a configurable low FPS
- Score each sampled frame using:
  - Laplacian sharpness
  - Mean brightness
- Rank all frames globally by visual quality
- Select the top K frames while enforcing a minimum time gap
- Post-process selected frames:
  - Optional crop
  - Optional color correction
  - Optional AI upscaling
- Save only the final best frames
Result: A clean, diverse, high-quality image set instead of noise.
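To make the scoring step concrete, here is a dependency-free sketch of the two measures named above: the variance of a Laplacian filter response (a common sharpness proxy) and mean brightness. It operates on a grayscale image stored as a list of rows. How Smart Extract actually combines the two values is not documented here, so the weighted sum in `quality_score` and its `brightness_weight` parameter are assumptions for illustration only. In practice the same sharpness measure is usually computed with `cv2.Laplacian(gray, cv2.CV_64F).var()`.

```python
from statistics import fmean, pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of rows of intensities and must be at least 3x3.
    Sharp edges give large positive and negative responses, so a higher
    variance suggests a sharper frame.
    """
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def mean_brightness(gray):
    """Average intensity over all pixels."""
    return fmean(value for row in gray for value in row)


def quality_score(gray, brightness_weight=0.1):
    """Hypothetical combination: sharpness plus a small brightness bonus."""
    return laplacian_variance(gray) + brightness_weight * mean_brightness(gray)


# A flat image has no edges; a checkerboard is all edges.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(5)] for y in range(5)]
print(quality_score(flat), quality_score(checker))
```

The flat image scores zero on sharpness, so its total comes only from brightness, while the checkerboard scores far higher.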
flowchart TD
A[Input Video File] --> B[Frame Sampling]
B --> C[Sharpness Scoring]
B --> D[Brightness Scoring]
C --> E[Global Frame Ranking]
D --> E[Global Frame Ranking]
E --> F[Temporal Diversity Filter]
F --> G[Best Frame Selection]
G --> H[Optional Auto Crop]
H --> I[Optional Color Correction]
I --> J[Optional AI Super Resolution]
J --> K[Final High Quality Frames]
- Conference talks and keynote recordings
- University lectures and classes
- Internal presentations and demos
- Slide reconstruction from recorded sessions
- Thumbnail generation
- Portfolio or documentation screenshots
- Python 3.9 or higher
- Windows, Linux, or macOS
- NVIDIA GPU with CUDA support (optional, enables GPU scoring)
- RTX series GPU (optional, for best performance)
Clone the repository:
git clone https://github.com/EyalPasha/smart-extract.git
cd smart-extract
Install dependencies:
pip install -r requirements.txt
This installs:
- torch (CUDA-enabled if available)
- opencv-python
- numpy
- decord for fast video decoding
- realesrgan and basicsr for AI upscaling
All behavior is controlled via config.py with no code changes required.
VIDEO_PATH = "IMG_8313.MOV"
OUTPUT_DIR = "photos"
OUTPUT_FORMAT = "PNG"

USE_GLOBAL_BEST_FRAMES = True
GLOBAL_TARGET_FRAMES = 100
GLOBAL_SAMPLE_FPS = 2.0
GLOBAL_MIN_TIME_BETWEEN = 8.0
GLOBAL_START_TIME_SECONDS = 27.0
This mode:
- Scans the entire video
- Picks the top K frames globally
- Ensures visual and temporal diversity
Output directory:
photos/best
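The interaction between `GLOBAL_SAMPLE_FPS`, `GLOBAL_START_TIME_SECONDS`, `GLOBAL_TARGET_FRAMES`, and `GLOBAL_MIN_TIME_BETWEEN` can be sketched as follows. This is one plausible reading of "top K with a minimum time gap" (a greedy pass over frames sorted by score), not necessarily how `smart_extract.py` implements it. The names `ScoredFrame`, `sample_times`, and `select_top_k` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoredFrame:
    time_s: float  # timestamp of the sampled frame, in seconds
    score: float   # quality score (higher is better)


def sample_times(duration_s, sample_fps, start_s=0.0):
    """Timestamps visited when scanning from start_s at sample_fps."""
    step = 1.0 / sample_fps
    times = []
    i = 0
    while start_s + i * step < duration_s:
        times.append(start_s + i * step)
        i += 1
    return times


def select_top_k(frames, k, min_gap_s):
    """Greedy selection: best score first, skipping frames that fall
    within min_gap_s of any frame that was already chosen."""
    chosen = []
    for frame in sorted(frames, key=lambda f: f.score, reverse=True):
        if len(chosen) >= k:
            break
        if all(abs(frame.time_s - c.time_s) >= min_gap_s for c in chosen):
            chosen.append(frame)
    return chosen


# With a 2.0 FPS scan starting at 27 s, a 30 s clip yields six samples.
print(sample_times(30.0, 2.0, start_s=27.0))

frames = [
    ScoredFrame(10.0, 5.0),
    ScoredFrame(12.0, 9.0),
    ScoredFrame(30.0, 7.0),
    ScoredFrame(31.0, 8.0),
    ScoredFrame(50.0, 1.0),
]
picked = select_top_k(frames, k=3, min_gap_s=8.0)
print([f.time_s for f in picked])  # [12.0, 31.0, 50.0]
```

In the example, the frames at 30 s and 10 s score well but are rejected because each lies within 8 seconds of a better frame, which is why a larger `GLOBAL_MIN_TIME_BETWEEN` yields a more varied set.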
USE_GPU_SCORING = True
ENABLE_SUPER_RESOLUTION = True
- GPU scoring accelerates sharpness and brightness evaluation
- RealESRGAN upscales selected frames by 4x for maximum clarity
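A typical way to honor a flag like `USE_GPU_SCORING` while still running on machines without CUDA is to resolve the device once at startup. The sketch below assumes that pattern; `detect_device` is a hypothetical helper, and only `torch.cuda.is_available()` is a real PyTorch call.

```python
def detect_device(use_gpu_scoring=True):
    """Return "cuda" when GPU scoring is requested and usable, else "cpu".

    Hypothetical helper: it falls back to the CPU if PyTorch is not
    installed or no CUDA device is visible.
    """
    if not use_gpu_scoring:
        return "cpu"
    try:
        import torch  # imported lazily so CPU-only setups still work
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Scoring device: {detect_device()}")
```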
Disabled by default for maximum fidelity:
ENABLE_AUTO_CROP = False
ENABLE_COLOR_CORRECTION = False
Enable these options if your recording contains:
- Excess dead space
- Projector color casts
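For intuition about what correcting a color cast involves, the sketch below uses the classic gray-world assumption: each channel is scaled so that the average color of the image becomes neutral gray. This is a stand-in technique chosen for illustration. The README does not say which method `ENABLE_COLOR_CORRECTION` uses, and the function names here are hypothetical.

```python
def gray_world_gains(pixels):
    """Per-channel gains that equalize the mean of R, G, and B.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    """
    count = len(pixels)
    means = [sum(p[c] for p in pixels) / count for c in range(3)]
    target = sum(means) / 3
    # A channel with zero mean cannot be rescaled, so leave it unchanged.
    return tuple(target / m if m > 0 else 1.0 for m in means)


def apply_gains(pixels, gains):
    """Scale each channel by its gain, clamping results to 255."""
    return [
        tuple(min(255, round(value * gain)) for value, gain in zip(p, gains))
        for p in pixels
    ]


# Two gray surfaces seen through a purple cast (too little green).
purple_cast = [(200, 100, 200), (100, 50, 100)]
corrected = apply_gains(purple_cast, gray_world_gains(purple_cast))
print(corrected)  # [(167, 167, 167), (83, 83, 83)]
```

Gray-world works well when the scene is roughly neutral on average and can overcorrect slides dominated by a single color, which is one reason to keep such a step optional.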
python smart_extract.py
You will see:
- Device detection (CPU or CUDA)
- Global scan progress bar
- Final selection summary
Example output:
Done! Saved 100 global best frames to photos/best
Each saved frame includes metadata in the filename:
best_042_t318.4s_score912.png
- 042 is the rank
- 318.4s is the timestamp
- score912 is the final quality score
Perfect for sorting, filtering, and automation.
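As an example of such automation, the filename pattern shown above can be parsed back into its parts with a regular expression. The sketch assumes every output file follows the `best_<rank>_t<seconds>s_score<score>.<ext>` layout of the example. `FrameInfo` and `parse_frame_name` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class FrameInfo(NamedTuple):
    rank: int
    time_s: float
    score: float


# Assumed layout: best_<rank>_t<seconds>s_score<score>.<extension>
_FRAME_NAME = re.compile(
    r"best_(\d+)_t(\d+(?:\.\d+)?)s_score(\d+(?:\.\d+)?)\.\w+$"
)


def parse_frame_name(name: str) -> Optional[FrameInfo]:
    """Extract rank, timestamp, and score, or None if the name does not match."""
    match = _FRAME_NAME.search(name)
    if match is None:
        return None
    rank, time_s, score = match.groups()
    return FrameInfo(int(rank), float(time_s), float(score))


info = parse_frame_name("best_042_t318.4s_score912.png")
print(info)  # FrameInfo(rank=42, time_s=318.4, score=912.0)
```

Sorting a directory listing by `time_s` instead of rank, for instance, restores the order in which the slides appeared in the talk.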
- If frames look too similar, increase GLOBAL_MIN_TIME_BETWEEN
- If key moments are missing, increase GLOBAL_SAMPLE_FPS
- If VRAM is limited, reduce the RealESRGAN tile size in smart_extract.py
Smart Extract is a production-ready visual curation pipeline.
It demonstrates:
- Computer vision fundamentals
- GPU acceleration with PyTorch
- Practical AI upscaling
- Performance-aware video processing
- Clean and configurable system design
Ideal for portfolio showcase, research tooling, or real-world media workflows.
MIT License
If this project helped you, consider starring the repository.