Starred repositories
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
A geometry-shader-based, global CUDA sorted high-performance 3D Gaussian Splatting rasterizer. Can achieve a 5-10x speedup in rendering compared to the vanilla diff-gaussian-rasterization.
[TIP 2025] ADStereo: Efficient Stereo Matching with Adaptive Downsampling and Disparity Alignment
ONNX-compatible Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
A comprehensive list of Implicit Representations, NeRF and 3D Gaussian Splatting papers relating to the SLAM/Robotics domain, including papers, videos, code, and related websites
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
A complete computer science study plan to become a software engineer.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Stereo4D dataset and processing code
[NeurIPS'23] Hierarchical Integration Diffusion Model for Realistic Image Deblurring
A curated list of awesome advice for computer science Ph.D. applicants.
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
[CVPR 2025] StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
A list of popular deep learning models related to classification, segmentation and detection problems
Official Repo for Deep Learning for Computer Vision Course offered by NPTEL
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
List of Research Internships for Undergraduate Students
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
This project is dedicated to the implementation and research of Kolmogorov-Arnold convolutional networks. The repository includes implementations of 1D, 2D, and 3D convolutions with different kern…
Collection of common code that's shared among different research projects in FAIR computer vision team.
This repository contains demos I made with the Transformers library by HuggingFace.
Compendium of free ML reading resources
Official PyTorch implementations for "SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation" (NeurIPS 2022)