- Nanjing University of Science and Technology
- Nanjing, China
- rayn-wu.github.io/
Stars
"Dive into Deep Learning" (《动手学深度学习》): aimed at Chinese readers, runnable, and open for discussion. The Chinese and English editions are used for teaching at over 500 universities across more than 70 countries.
Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Image-to-Image Translation in PyTorch
An open source implementation of CLIP.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
A collaboration friendly studio for NeRFs
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks.
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
Pretrained Pytorch face detection (MTCNN) and facial recognition (InceptionResnet) models
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
CUDA accelerated rasterization of gaussian splatting
Transformer: PyTorch Implementation of "Attention Is All You Need"
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Scenic: A Jax Library for Computer Vision Research and Beyond
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Papers and Datasets about Point Cloud.
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching