A novel lossless video compression method based on rational Bloom filters that achieves significant space savings while guaranteeing perfect bit-exact reconstruction.
This project implements a lossless video compression scheme using rational Bloom filters, a probabilistic data structure that allows for efficient representation of binary data. The key innovation is the use of a non-integer (rational) number of hash functions in the Bloom filter, which theoretically enables better compression than traditional methods.
The compression system targets raw video content (Y4M, YUV, HDR, etc.) and provides:
- True lossless compression with bit-exact reconstruction
- Space savings of 40-50% on typical video content
- Efficient encoding and decoding with multi-threaded support
- Support for various color spaces (RGB, BGR, YUV)
- Handling of high dynamic range (HDR) content (this still needs work to be fast and usable)
- Python 3.7+
- Required packages:
  - numpy
  - opencv-python
  - matplotlib
  - pandas
  - tqdm
  - requests
  - xxhash
  - Pillow
  - scikit-image
  - pyexr (for HDR support)
Install all dependencies with:
pip install -r requirements.txt

from improved_video_compressor import ImprovedVideoCompressor
# Initialize compressor
compressor = ImprovedVideoCompressor(
    noise_tolerance=10.0,
    keyframe_interval=30,
    use_direct_yuv=True,
    verbose=True
)
# Compress a video
compressor.compress_video(
    input_file="input_video.y4m",
    output_file="compressed.bfvc"
)
# Decompress a video
compressor.decompress_video(
    input_file="compressed.bfvc",
    output_file="decompressed.mp4"
)
# Verify lossless decompression
original_frames = compressor.extract_frames_from_video("input_video.y4m")
decompressed_frames = compressor.decompress_video("compressed.bfvc")
verification = compressor.verify_lossless(original_frames, decompressed_frames)
print(f"Lossless: {verification['lossless']}")

# Compress a video
python -m improved_video_compressor compress input_video.y4m output.bfvc --max-frames 30
# Decompress a video
python -m improved_video_compressor decompress output.bfvc decompressed.mp4
# Process raw YUV file
python -m improved_video_compressor process-yuv input.yuv output.bfvc --width 1920 --height 1080 --format YUV444

The project includes a comprehensive benchmarking system that compares the rational Bloom filter compression with other lossless compression methods such as FFV1, HuffYUV, and H.264 (lossless mode).
# Run the benchmark
python benchmark_compression.py
# Run benchmark with specific datasets and methods
python benchmark_compression.py --datasets y4m --methods bloom ffv1 --max-frames 10

See results.md for detailed benchmark results and instructions on how to reproduce them.
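When comparing methods by hand, a space-savings figure can be computed from file sizes. The snippet below is a minimal sketch that assumes the common definition, savings = 1 - compressed size / original size; the helper names `space_savings` and `file_space_savings` are hypothetical and are not part of `benchmark_compression.py`, which may report its numbers differently.

```python
import os


def space_savings(original_size: int, compressed_size: int) -> float:
    """Fraction of space saved: 1 - compressed / original (assumed definition)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 1.0 - compressed_size / original_size


def file_space_savings(original_path: str, compressed_path: str) -> float:
    """Same calculation, reading both sizes from disk."""
    return space_savings(
        os.path.getsize(original_path),
        os.path.getsize(compressed_path),
    )
```

A negative result means the compressed file is larger than the original.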
The compression scheme works through the following steps:
- Frame Extraction: Extract frames from the input video
- Keyframe Selection: Store keyframes as direct zlib-compressed frames
- Bloom Filter Compression: For inter-frames, compress difference maps using rational Bloom filters
- Lossless Verification: Verify bit-exact reconstruction during decompression
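The control flow of these steps can be illustrated with a small, self-contained sketch. It is not the project's implementation: frames are assumed to be equal-length `bytes` objects, the function names are hypothetical, and the inter-frame differences are compressed with plain zlib as a stand-in for the rational Bloom filter stage.

```python
import zlib


def compress_frames(frames, keyframe_interval=30):
    """Encode frames as ("key", payload) or ("diff", payload) records."""
    records = []
    prev = None
    for index, frame in enumerate(frames):
        if index % keyframe_interval == 0:
            # Keyframe: store the whole frame, zlib-compressed.
            records.append(("key", zlib.compress(frame)))
        else:
            # Inter-frame: byte-wise difference modulo 256, which is exactly
            # invertible. zlib stands in for the Bloom filter stage here.
            diff = bytes((cur - old) & 0xFF for cur, old in zip(frame, prev))
            records.append(("diff", zlib.compress(diff)))
        prev = frame
    return records


def decompress_frames(records):
    """Invert compress_frames, rebuilding each frame from its predecessor."""
    frames = []
    prev = None
    for kind, payload in records:
        data = zlib.decompress(payload)
        if kind == "key":
            frame = data
        else:
            frame = bytes((old + delta) & 0xFF for old, delta in zip(prev, data))
        frames.append(frame)
        prev = frame
    return frames


def verify_lossless(original_frames, restored_frames):
    """Bit-exact check: every restored frame must equal its original."""
    return list(original_frames) == list(restored_frames)
```

Because the difference is taken modulo 256, adding it back to the previous frame recovers every byte exactly, which is what makes the final verification step pass.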
The rational Bloom filter uses a non-integer number of hash functions (k*) to optimize the space-accuracy tradeoff. This is implemented by using ⌊k*⌋ hash functions deterministically, plus an additional hash function applied with probability (k* - ⌊k*⌋).
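The following is a minimal sketch of that idea, not the code in `improved_video_compressor.py`. It makes two assumptions: hashing uses BLAKE2b from the standard library rather than xxhash, and the probabilistic choice of the extra hash is derived deterministically from the item itself, so that insertion and lookup always agree and the filter never produces false negatives. The class and function names are hypothetical. `optimal_k` uses the standard Bloom filter result k = (m/n) ln 2, which is generally not an integer.

```python
import hashlib
import math


def optimal_k(num_bits: int, num_items: int) -> float:
    """Standard Bloom filter optimum k = (m / n) * ln 2 (usually non-integer)."""
    return (num_bits / num_items) * math.log(2)


class RationalBloomFilter:
    """Bloom filter with a non-integer number of hash functions k*."""

    _SELECTOR_SEED = 2 ** 32  # seed reserved for the "use extra hash?" decision

    def __init__(self, num_bits: int, k_star: float):
        if num_bits <= 0 or k_star <= 0:
            raise ValueError("num_bits and k_star must be positive")
        self.num_bits = num_bits
        self.k_floor = math.floor(k_star)
        self.fraction = k_star - self.k_floor
        self.bits = bytearray((num_bits + 7) // 8)

    @staticmethod
    def _hash(item: bytes, seed: int) -> int:
        # 64-bit seeded hash; the seed is passed as the BLAKE2b salt.
        salt = seed.to_bytes(8, "little")
        digest = hashlib.blake2b(item, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little")

    def _positions(self, item: bytes):
        # floor(k*) hash functions are always applied.
        positions = [self._hash(item, s) % self.num_bits for s in range(self.k_floor)]
        # One more is applied with probability k* - floor(k*). The decision is
        # a function of the item, so add() and lookups see the same positions.
        selector = self._hash(item, self._SELECTOR_SEED) / 2 ** 64
        if selector < self.fraction:
            positions.append(self._hash(item, self.k_floor) % self.num_bits)
        return positions

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits[pos // 8] >> (pos % 8)) & 1 for pos in self._positions(item))
```

For example, with `k_star=2.5` every item sets two bits, and roughly half of all items set a third. Membership tests such as `b"pixel" in bloom` can still return false positives, as with any Bloom filter.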
- improved_video_compressor.py - Main implementation of the compression algorithm
- verify_true_lossless.py - Script to verify lossless reconstruction
- benchmark_compression.py - Benchmark system comparing different methods
- download_*.py - Scripts to download test datasets
- results.md - Detailed benchmark results and analysis
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this code in your research, please cite:
@misc{rationalbloom2023,
author = {Author},
title = {Rational Bloom Filter Video Compression},
year = {2023},
publisher = {GitHub},
url = {https://github.com/username/rational-bloom-filter-compression}
}