Stars
Official code release of "CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition"
[CVPR 2025 Highlight] Official code repository for "Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning"
[AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation
[ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
[ICML2025 Oral] ReferSplat: Referring Segmentation in 3D Gaussian Splatting
[ECCV 2024] The official implementation of the paper 3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
[NeurIPS 2023] Weakly Supervised 3D Open-vocabulary Segmentation
📚 A collection of papers about Referring Image Segmentation.
[ICCV 2023] Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
[ICCV 2025] Task-Specific Zero-shot Quantization-Aware Training for Object Detection
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
[CVPR 2025] Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[CVPR 2023] EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
[ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
This is a PyTorch implementation of 3DGCTR, proposed in our paper “Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization”
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
[CVPR 2025, All Strong Accept] TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
awesome grounding: A curated list of research papers in visual grounding
[CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.
[ECCV 2024] Pseudo-Embedding for Generalized Few-Shot 3D Segmentation
[ICLR 2025 Spotlight] Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
[CVPR 2023] Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders