This project was inspired by Daft's blog post: Processing 300K Images Without OOM
Streaming image processing in Rust. Compare naive, batched, and streaming pipelines while tracking time and memory.
Project writeup: Flux: Image Processing Without OOM
- Rust 1.70+ (install via rustup)
git clone https://github.com/AarjavPatni/flux.git
cd flux
cargo build --release
- Downloads images from Lorem Picsum
- Processes images via three approaches: naive, batched, streaming
- Tracks time, memory, and throughput
- Outputs a comparison table
Naive (src/naive + src/image_processor)
Batched (src/batched)
Streaming (src/streaming)
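To make the difference between the approaches concrete, here is a minimal sketch of what a batched run could look like. It is not taken from `src/batched`: the `process_one` helper is a hypothetical stand-in for download plus resize, and plain `std::thread` is an assumption made so the example runs without external crates. The naive pipeline corresponds to calling `process_one` in a plain loop.

```rust
use std::thread;

/// Hypothetical stand-in for downloading and resizing one image.
/// Here it only doubles the id so the result is easy to check.
fn process_one(id: usize) -> usize {
    id * 2
}

/// Processes ids in fixed-size batches. Items inside a batch run in
/// parallel; the next batch starts only after the current one finishes.
/// Panics if `batch_size` is 0, because `chunks` requires a non-zero size.
fn run_batched(ids: &[usize], batch_size: usize) -> Vec<usize> {
    let mut results = Vec::with_capacity(ids.len());
    for batch in ids.chunks(batch_size) {
        // Spawn one thread per item in this batch.
        let handles: Vec<_> = batch
            .iter()
            .map(|&id| thread::spawn(move || process_one(id)))
            .collect();
        // Wait for the whole batch before moving on.
        for handle in handles {
            results.push(handle.join().unwrap());
        }
    }
    results
}

fn main() {
    let ids: Vec<usize> = (0..25).collect();
    let out = run_batched(&ids, 10);
    println!("processed {} images in batches of 10", out.len());
}
```

Because every batch waits for its slowest item, a single slow download stalls the other workers in that batch. A streaming design avoids that stall.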
# Run all pipelines with specified image count (default: 200)
cargo run --release -- <image_count>
# Examples
RUST_LOG=info cargo run --release -- 1000
RUST_LOG=info cargo run --release -- 20
Logging levels: Set RUST_LOG=info for summaries or RUST_LOG=debug for per-image details.
Outputs are written to:
data/processed/naive
data/processed/batched
data/processed/streaming
Test machine
- MacBook Pro (14-inch, 2023)
- Apple M2 Pro (10-core), 16 GB RAM
- macOS Sequoia 15.7.3 (arm64)
| Approach | Images | Time (ms) | Peak Mem (MB) | Avg DL (ms) | Avg Resize (ms) | Throughput (img/s) |
|---|---|---|---|---|---|---|
| naive | 1000 | 850341 | 91 | 302 | 415 | 1.18 |
| batched | 1000 | 146564 | 210 | 548 | 500 | 6.82 |
| streaming | 1000 | 77831 | 263 | 289 | 769 | 12.85 |
Speedups (time):
- Batched is 5.80x faster than naive
- Streaming is 10.93x faster than naive
- Streaming is 1.88x faster than batched
Throughput (img/s):
- Batched is 5.80x higher than naive
- Streaming is 10.93x higher than naive
- Streaming is 1.88x higher than batched
Peak memory (MB):
- Batched is 2.31x higher than naive
- Streaming is 2.89x higher than naive
- Streaming is 1.25x higher than batched
Memory usage results: Streaming has the highest peak (263 MB vs. 91 MB for naive) because its download and resize stages overlap, so more images are in flight at once.
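The ratios above follow directly from the table. The sketch below recomputes them; the `throughput` and `ratio` helpers are illustrative names, not functions from this repository.

```rust
/// Throughput in images per second, given an image count and elapsed milliseconds.
fn throughput(images: u64, elapsed_ms: u64) -> f64 {
    images as f64 / (elapsed_ms as f64 / 1000.0)
}

/// How many times larger `a` is than `b`.
fn ratio(a: f64, b: f64) -> f64 {
    a / b
}

fn main() {
    // Times (ms) and peak memory (MB) copied from the results table.
    let (naive_ms, batched_ms, streaming_ms) = (850_341u64, 146_564u64, 77_831u64);
    let (naive_mb, batched_mb, streaming_mb) = (91.0, 210.0, 263.0);

    for (name, ms) in [("naive", naive_ms), ("batched", batched_ms), ("streaming", streaming_ms)] {
        println!("{name}: {:.2} img/s", throughput(1000, ms));
    }

    // Speedup is the ratio of elapsed times (slower / faster).
    println!("batched vs naive: {:.2}x", ratio(naive_ms as f64, batched_ms as f64));
    println!("streaming vs naive: {:.2}x", ratio(naive_ms as f64, streaming_ms as f64));
    println!("streaming vs batched: {:.2}x", ratio(batched_ms as f64, streaming_ms as f64));

    // Peak memory ratios.
    println!("batched mem vs naive: {:.2}x", ratio(batched_mb, naive_mb));
    println!("streaming mem vs naive: {:.2}x", ratio(streaming_mb, naive_mb));
    println!("streaming mem vs batched: {:.2}x", ratio(streaming_mb, batched_mb));
}
```

Since every run processed the same 1000 images, the throughput ratios equal the time speedups.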
Concurrency parameters (for 1k run):
- Batched: batch size 10
- Streaming: download concurrency 20, process concurrency 4, channel capacity 50
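As a rough illustration of how the three streaming parameters interact, here is a two-stage pipeline built on the standard library's bounded `sync_channel`. This is a sketch under assumptions, not the code in `src/streaming`: the project may use an async runtime instead of OS threads, and `fake_download` and `fake_resize` are hypothetical placeholders for the real work.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for an HTTP download: returns dummy bytes.
fn fake_download(id: usize) -> Vec<u8> {
    vec![0u8; 1024 + id]
}

/// Hypothetical stand-in for a resize: just reports the buffer length.
fn fake_resize(bytes: &[u8]) -> usize {
    bytes.len()
}

/// Runs download and process stages concurrently and returns the number
/// of images processed. `dl_workers` must be non-zero (`step_by` panics on 0).
fn run_streaming(image_count: usize, dl_workers: usize, proc_workers: usize, capacity: usize) -> usize {
    // Bounded channel: `send` blocks once `capacity` items are queued,
    // so downloaders cannot run arbitrarily far ahead of the processors.
    let (tx, rx) = sync_channel::<Vec<u8>>(capacity);
    let rx = Arc::new(Mutex::new(rx));

    // Download stage: worker `w` handles ids w, w + dl_workers, ...
    let mut downloaders = Vec::new();
    for w in 0..dl_workers {
        let tx = tx.clone();
        downloaders.push(thread::spawn(move || {
            for id in (w..image_count).step_by(dl_workers) {
                if tx.send(fake_download(id)).is_err() {
                    break; // receiver side is gone
                }
            }
        }));
    }
    // Drop the original sender so the channel closes when all workers finish.
    drop(tx);

    // Process stage: workers share the single receiver behind a mutex.
    let mut processors = Vec::new();
    for _ in 0..proc_workers {
        let rx = Arc::clone(&rx);
        processors.push(thread::spawn(move || {
            let mut done = 0usize;
            loop {
                // The lock is released at the end of this statement,
                // so resizing happens without holding it.
                let msg = rx.lock().unwrap().recv();
                match msg {
                    Ok(bytes) => {
                        let _ = fake_resize(&bytes);
                        done += 1;
                    }
                    Err(_) => break, // all senders dropped and queue drained
                }
            }
            done
        }));
    }

    for d in downloaders {
        d.join().unwrap();
    }
    processors.into_iter().map(|p| p.join().unwrap()).sum()
}

fn main() {
    // Same parameters as the 1k run: 20 downloaders, 4 processors, capacity 50.
    let processed = run_streaming(1000, 20, 4, 50);
    println!("processed {processed} images");
}
```

In this sketch the channel capacity caps how many downloaded images can sit in memory waiting for a processor.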
src/
├── main.rs # CLI + pipeline runner
├── url_generator.rs # Lorem Picsum URLs
├── image_processor.rs # Single-image baseline
├── memory_monitor.rs # Process memory tracking
├── naive/ # Sequential pipeline
├── batched/ # Batched pipeline
└── streaming/ # Streaming pipeline (channels + backpressure)
- Network variability affects timings
- Picsum images are random but stable via seeds
- Streaming pipeline uses bounded channels for backpressure
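The note on seeds can be illustrated with Lorem Picsum's seeded URL form, `https://picsum.photos/seed/{seed}/{width}/{height}`, which returns the same image for the same seed. The function names and the `flux-{i}` seed scheme below are hypothetical and are not necessarily what `url_generator.rs` does.

```rust
/// Builds a Lorem Picsum URL that yields the same image for the same seed.
fn seeded_url(seed: &str, width: u32, height: u32) -> String {
    format!("https://picsum.photos/seed/{seed}/{width}/{height}")
}

/// Generates `count` URLs with seeds derived from the index, so repeated
/// runs request the same set of images. The "flux-" prefix is an
/// illustrative choice, not the project's actual scheme.
fn generate_urls(count: usize, width: u32, height: u32) -> Vec<String> {
    (0..count)
        .map(|i| seeded_url(&format!("flux-{i}"), width, height))
        .collect()
}

fn main() {
    for url in generate_urls(3, 800, 600) {
        println!("{url}");
    }
}
```

Stable URLs mean every pipeline downloads the same inputs, so differences in timing come from the pipeline design and network conditions, not from the image set.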