- Shanghai AI Lab
- Shanghai, China
- @Haoyu__Guo
Lists (24)
2DV
3D segmentation
3DV
4D
Acceleration / Compression
Datasets
Experience
Framework
GAN
Generation
Human
Indoor
Inverse rendering
Learning
MVS / Stereo matching
NLP
Other
Representation
Review / Survey
RL
SfM / SLAM
Surface reconstruction
Tools
View synthesis
Stars
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Learn OpenCV: C++ and Python Examples
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Convert AI papers to GUI. Make it easy and convenient for everyone to use cutting-edge artificial intelligence technology.
Code release for NeRF (Neural Radiance Fields)
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021, T-PAMI 2022
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Instant-ngp in pytorch+cuda trained with pytorch-lightning (high quality with high speed, with only a few lines of legible code)
Code for "GVHMR: World-Grounded Human Motion Recovery via Gravity-View Coordinates", SIGGRAPH Asia 2024
Code for "Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed", CVPR 2024
[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation
An unofficial implementation of the paper "3D Gaussian Splatting for Real-Time Radiance Field Rendering" in Taichi Lang.
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
Data, tools, and documentation of the Fusion 360 Gallery Dataset
Code for "Fast and Robust Multi-Person 3D Pose Estimation from Multiple Views" (CVPR 2019, T-PAMI 2021)
A Scalable Pipeline for Making Steerable Multi-Task Mid-Level Vision Datasets from 3D Scans [ICCV 2021]
Notebook example of how to generate class visualizations with Caffe
Official implementation of VaxNeRF (Voxel-Accelerated NeRF).
Code for Ditto: Building Digital Twins of Articulated Objects from Interaction