Stars
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
OpenMMLab Detection Toolbox and Benchmark
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
The official repo of the Qwen (通义千问) chat and pretrained large language models proposed by Alibaba Cloud.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Wan: Open and Advanced Large-Scale Video Generative Models
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
🐍 Geometric Computer Vision Library for Spatial AI
Text-to-3D, Image-to-3D, and Mesh Export with NeRF + Diffusion.
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting
Infinite Photorealistic Worlds using Procedural Generation
Official repo for consistency models.
OpenMMLab's next-generation platform for general 3D object detection.
Python module that makes working with XML feel like you are working with JSON
OpenPCDet Toolbox for LiDAR-based 3D Object Detection.
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
A PyTorch Library for Accelerating 3D Deep Learning Research
A PyTorch native platform for training generative AI models
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
Release for Improved Denoising Diffusion Probabilistic Models
[ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Plenoxels: Radiance Fields without Neural Networks
Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".
Lumina-T2X is a unified framework for Text to Any Modality Generation