Brioche is a Rust implementation of the ml-depth-pro repository: it re-implements the depth-pro neural network model using Burn. It uses ONNX to load the underlying vit_large_patch14_dinov2 model and weights, which are exported from Python. The Python export scripts are located in the butter folder.
Note
The first image is the original image used to perform the inference, the second is the inference result without quantization, and the third is the result with quantization.
- You'll need to have the latest version of Rust installed.
- You'll need to have the uv package manager installed. Once it is installed, install the required dependencies by running the following command in the butter folder:

      uv sync

- Download the depth-pro checkpoint file by running the following command in the butter folder:

      uv run vit_exporter.py --download-checkpoint --checkpoint-path ./depth_pro.pt

- Export the ONNX model & weights by running the following script:

      uv run vit_exporter.py --checkpoint-path ./depth_pro.pt

- Export the weights for the network model by running the following command:

      uv run state_exporter.py --checkpoint-path ./depth_pro.pt

- Once you have all these files, you should be able to run the sample using this command from the root folder of this repository:

      cargo run --release --example sample_metal --features="metal"

The project supports half precision (f16) for the network model. To use it, first export the ONNX model & weights for half precision by running the following command:

    uv run vit_exporter.py --checkpoint-path ./depth_pro.pt --half

Once that's done, you can run the sample with the following commands. The first runs the sample in half precision (f16) with the ort_onnx feature; the second selects the f32 and burn_onnx features instead:

    cargo run --example sample_metal --features="metal,f16,ort_onnx" --profile release-lto --no-default-features

    cargo run --release --example sample_metal --features="metal,f32,burn_onnx" --no-default-features

A quantized model has been generated using ONNX. This model is lighter and performs inference faster. To run the quantized model, use the following commands:
- Export the quantized model:

      cd butter && uv run vit_exporter.py --checkpoint-path ./depth_pro.pt --quantize

- Run the inference:

      cargo run --release --example sample_metal --features="metal,f32" -- --quantize

This project serves an educational purpose. When using the ORT library, converting a CPU tensor from ONNX to a GPU tensor incurs an overhead, which is not ideal for production use. Below is a performance table (Burn ONNX seems to be faster than ORT on GPU):

| Device | Precision (backend) | Time (s) |
|---|---|---|
| MacBook Pro M1 Pro | Full Precision (ort) | 125 |
| MacBook Pro M1 Pro | Half Precision (ort) | 141 |
| MacBook Pro M1 Pro | Half Precision (burn) | 122 |
| MacBook Pro M1 Pro | Full Precision (burn) | 123 |
| MacBook Pro M1 Pro | Quantized model (ort) | 107 |
| MacBook Pro M4 Pro | Full Precision (ort) | 32 |
| MacBook Pro M4 Pro | Quantized model (ort) | 30 |
| MacBook Pro M4 Pro | Half Precision (ort) | 35 |
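
To make the table easier to compare, the timings can be turned into relative speedups. The snippet below is a minimal sketch and not part of the repository: the `speedup` helper and the choice of comparisons are illustrative assumptions, and the numbers are copied from the table above. Each row appears to be a single measurement, so small differences should be read with caution.

```rust
/// Relative speedup of a candidate run over a baseline run, both in seconds.
/// A value above 1.0 means the candidate is faster than the baseline.
/// (Hypothetical helper for illustration; it does not exist in the project.)
fn speedup(baseline_s: f64, candidate_s: f64) -> f64 {
    baseline_s / candidate_s
}

fn main() {
    // (device, baseline label, baseline seconds, candidate label, candidate seconds),
    // with the timings taken from the performance table.
    let comparisons = [
        ("M1 Pro", "full precision (ort)", 125.0, "quantized (ort)", 107.0),
        ("M1 Pro", "half precision (ort)", 141.0, "half precision (burn)", 122.0),
        ("M4 Pro", "full precision (ort)", 32.0, "quantized (ort)", 30.0),
    ];

    for (device, base, base_s, cand, cand_s) in comparisons {
        println!("{device}: {cand} vs {base}: {:.2}x", speedup(base_s, cand_s));
    }
}
```

With these figures, the quantized model is roughly 1.17x faster than full precision on the M1 Pro and roughly 1.07x faster on the M4 Pro, while Burn is roughly 1.16x faster than ORT for half precision on the M1 Pro.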
Credits to the depth-pro team for open-sourcing their work.
@inproceedings{Bochkovskii2024:arxiv,
author = {Aleksei Bochkovskii and Ama\"{e}l Delaunoy and Hugo Germain and Marcel Santos and
Yichao Zhou and Stephan R. Richter and Vladlen Koltun},
title = {Depth Pro: Sharp Monocular Metric Depth in Less Than a Second},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://arxiv.org/abs/2410.02073},
}