This is a fork of the official SHARP repository, optimized for local execution on Apple Silicon (M4 series and earlier) via Metal Performance Shaders (MPS). It adds local rendering engines that remove the need for CUDA on Mac. Producing the final splats and rendered images takes longer than a second here, but I consider that acceptable given that only Apple Silicon / MPS is used to process ~1M Gaussian splats.
- Splats prediction: ~7 seconds per image.
- Rendering: ~17 seconds per frame ("high quality" mode), ~0.5 seconds per frame ("fast" mode).
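To put these figures in perspective, here is a rough back-of-the-envelope estimate of the wall-clock time for one input image plus a rendered sequence. The 60-frame sequence length is an assumption chosen for illustration, not a default of this project; the timings are the approximate values listed above.

```python
# Rough wall-clock estimate from the approximate timings listed above.
PREDICT_SECONDS = 7.0  # splat prediction, per image
RENDER_SECONDS_PER_FRAME = {"fast": 0.5, "high": 17.0}


def estimate_total_seconds(num_frames: int, quality: str) -> float:
    """Prediction time for one image plus rendering time for num_frames frames."""
    return PREDICT_SECONDS + num_frames * RENDER_SECONDS_PER_FRAME[quality]


if __name__ == "__main__":
    frames = 60  # hypothetical sequence length, for illustration only
    for quality in ("fast", "high"):
        total = estimate_total_seconds(frames, quality)
        print(f"{quality}: ~{total:.0f} s (~{total / 60:.1f} min)")
```

Under these assumptions, a 60-frame sequence takes roughly 37 seconds in fast mode and about 17 minutes in high quality mode.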
SHARP for Mac provides two distinct rendering modes to balance speed and fidelity.
| Input Image | Fast Mode (~0.5s/frame) | High Quality Mode (~17s/frame) |
|---|---|---|
| Original | Deep Cloud Splatting | Absolute Fidelity |
We've included a Gradio-based web interface for a more interactive experience. You can upload an image and preview the 3D generation in real time.
To run the UI:
python src/sharp/web_ui.py
This software project accompanies the research paper: Sharp Monocular View Synthesis in Less Than a Second by Lars Mescheder, Wei Dong, Shiwei Li, Xuyang Bai, Marcel Santos, Peiyun Hu, Bruno Lecouat, Mingmin Zhen, Amaël Delaunoy, Tian Fang, Yanghai Tsin, Stephan Richter and Vladlen Koltun.
We present SHARP, an approach to photorealistic view synthesis from a single image. Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the depicted scene. This is done in less than a second on a standard GPU via a single feedforward pass through a neural network. The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements. Experimental results demonstrate that SHARP delivers robust zero-shot generalization across datasets. It sets a new state of the art on multiple datasets, reducing LPIPS by 25–34% and DISTS by 21–43% versus the best prior model, while lowering the synthesis time by three orders of magnitude.
We recommend first creating a Python environment:
conda create -n sharp python=3.13
Afterwards, you can install the project using:
# For Linux/Windows (CUDA)
pip install -r requirements.txt
# For Mac (Apple Silicon/CPU)
pip install -r requirements_mps.txt
To test the installation, run
sharp --help
To run prediction (the device defaults to CUDA if available, then MPS, then CPU):
# Auto-detect best device
sharp predict -i /path/to/input/images -o /path/to/output/gaussians
# Explicitly use Apple Silicon with high quality rendering
sharp predict --device mps --quality high -i /path/to/input/images -o /path/to/output/gaussians --render
# Explicitly use CPU
sharp predict --device cpu -i /path/to/input/images -o /path/to/output/gaussians
The model checkpoint will be downloaded automatically on first run and cached locally at ~/.cache/torch/hub/checkpoints/.
Alternatively, you can download the model directly:
wget https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt
To use a manually downloaded checkpoint, specify it with the -c flag:
sharp predict -i /path/to/input/images -o /path/to/output/gaussians -c sharp_2572gikvuh.pt
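The device fallback order mentioned above (CUDA, then MPS, then CPU) can be sketched as follows. This is a minimal illustration, not the code used by the sharp CLI: the function names pick_device and detect_device are hypothetical, and the detection step assumes PyTorch is installed, relying only on torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU without it."""
    try:
        import torch
    except ImportError:
        return "cpu"  # assumption: no PyTorch means CPU-only
    # Older PyTorch builds may not expose the MPS backend at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

Keeping the priority logic in a pure function makes the ordering easy to check without any GPU present.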
The results are 3D Gaussian splats (3DGS) written to the output folder. The 3DGS .ply files are compatible with various public 3DGS renderers. We follow the OpenCV coordinate convention (x right, y down, z forward). The 3DGS scene center is roughly at (0, 0, +z). When using third-party renderers, please scale and rotate the scene to re-center it accordingly.
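As an illustration of the re-centering and axis conversion mentioned above, the sketch below shifts a set of points so their centroid sits at the origin, then maps them from the OpenCV convention (x right, y down, z forward) to an OpenGL-style convention (x right, y up, z backward). The helper names are hypothetical, and the target convention is an assumption about the third-party renderer. Only point positions are handled; Gaussian orientations would need the corresponding rotation applied as well.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def centroid(points: List[Point]) -> Point:
    """Mean position of the points."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def recenter(points: List[Point]) -> List[Point]:
    """Translate points so that their centroid is at the origin."""
    cx, cy, cz = centroid(points)
    return [(x - cx, y - cy, z - cz) for x, y, z in points]


def opencv_to_opengl(points: List[Point]) -> List[Point]:
    """Flip y and z: a 180-degree rotation about the x axis."""
    return [(x, -y, -z) for x, y, z in points]


if __name__ == "__main__":
    # Two made-up points in front of the camera (z forward, y down).
    pts = [(0.0, 1.0, 2.0), (2.0, 3.0, 4.0)]
    print(opencv_to_opengl(recenter(pts)))
```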
You can render high-resolution videos with a metric camera trajectory. This project provides optimized rendering paths for all major platforms:
- NVIDIA (CUDA): Uses optimized gsplat kernels for maximum throughput.
- Mac (Apple Silicon): Features a custom high-performance rendering engine optimized for M-series (including M-ultra) with two specialized modes:
  - --quality fast (Default): Uses Deep Cloud Splatting (~0.3-0.6s/frame). Optimized for maximum speed and visual density, perfect for rapid previews.
  - --quality high: Uses a formal Kerbl-based 3DGS engine with Absolute Fidelity (~17-27s/frame). Implements precise perspective ellipsoid projection, EWA splatting, 2x2 Super-Sampling (SSAA), and a massive depth buffer (K=2048) in float32 to eliminate all black artifacts and micro-gaps for professional-grade visual fidelity.
- CPU: Universal fallback using the fast splatting logic.
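The perspective projection step named for --quality high can be illustrated with the standard EWA approximation, in which a 3D covariance S expressed in camera coordinates is mapped to a 2D screen-space covariance J S J^T, where J is the Jacobian of the pinhole projection at the Gaussian's center. The following is a minimal pure-Python sketch under that assumption. It is not the engine's actual implementation, and the function name and focal-length parameters are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(row) for row in zip(*a)]


def project_covariance(cov_cam: Matrix, center_cam: List[float],
                       fx: float, fy: float) -> Matrix:
    """Project a 3x3 camera-space covariance to a 2x2 screen-space one.

    Uses the Jacobian of the pinhole projection (u, v) = (fx*x/z, fy*y/z)
    evaluated at the Gaussian center (x, y, z), with z > 0.
    """
    x, y, z = center_cam
    jac = [
        [fx / z, 0.0, -fx * x / (z * z)],
        [0.0, fy / z, -fy * y / (z * z)],
    ]
    return matmul(matmul(jac, cov_cam), transpose(jac))


if __name__ == "__main__":
    # Isotropic Gaussian (variance 0.01) on the optical axis, 2 units away.
    cov = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
    print(project_covariance(cov, [0.0, 0.0, 2.0], fx=100.0, fy=100.0))
```

For the on-axis example, the Jacobian reduces to a uniform scale of fx/z = 50, so the screen-space variance is 0.01 × 50² = 25 on each axis, with no cross term.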
To render during prediction:
# High-speed preview on Mac
sharp predict --device mps --quality fast -i /path/to/input/images -o /path/to/output/gaussians --render
# Professional quality render on Mac
sharp predict --device mps --quality high -i /path/to/input/images -o /path/to/output/gaussians --render
Or from the intermediate gaussians:
# Render smooth visuals on Mac
sharp render --device mps --quality high -i /path/to/output/gaussians -o /path/to/output/renderings_hq
# Rapid sequence rendering on Mac
sharp render --device mps --quality fast -i /path/to/output/gaussians -o /path/to/output/renderings_fast
Please refer to the paper for both quantitative and qualitative evaluations. Additionally, please check out this qualitative examples page containing several video comparisons against related work.
If you find our work useful, please cite the following paper:
@inproceedings{Sharp2025:arxiv,
title = {Sharp Monocular View Synthesis in Less Than a Second},
author = {Lars Mescheder and Wei Dong and Shiwei Li and Xuyang Bai and Marcel Santos and Peiyun Hu and Bruno Lecouat and Mingmin Zhen and Ama\"{e}l Delaunoy and Tian Fang and Yanghai Tsin and Stephan R. Richter and Vladlen Koltun},
journal = {arXiv preprint arXiv:2512.10685},
year = {2025},
url = {https://arxiv.org/abs/2512.10685},
}
Our codebase builds on multiple open-source contributions; please see ACKNOWLEDGEMENTS for more details.
Please check out the repository LICENSE before using the provided code and LICENSE_MODEL for the released models. Updated and maintained at https://github.com/ghif/ml-sharp.