[CVPR 2026] This repository is the official implementation of MVGGT: Multimodal Visual Geometry Grounded Transformer for Multiview 3D Referring Expression Segmentation

Python 112 1 Updated Mar 24, 2026
Python 5 Updated Dec 25, 2025
Python 43 3 Updated Jan 1, 2026

[ICML2025 Oral] ReferSplat: Referring Segmentation in 3D Gaussian Splatting

Jupyter Notebook 146 8 Updated Sep 16, 2025

A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation

338 15 Updated Apr 11, 2026

Official repository for the AAAI2026 paper (Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach)

Python 27 2 Updated Apr 24, 2026

[ICLR 2026] Official implementation of JavisDiT and JavisDiT++ series.

Python 364 29 Updated Mar 29, 2026

[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)

13,282 857 Updated Mar 12, 2026

[CVPR'26] PE3R: Perception-Efficient 3D Reconstruction. Take 2 - 3 photos with your phone, upload them, wait a few minutes, and then start exploring your 3D world via text!

Python 406 17 Updated Feb 28, 2026

✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension"

Python 78 2 Updated Apr 28, 2025

Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1st-place solution for LSVOS)

Python 1,593 117 Updated Apr 29, 2026

[ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models

Python 71 1 Updated May 15, 2025

【AAAI2025】MambaPro: Multi-Modal Object Re-Identification with Mamba Aggregation and Synergistic Prompt

Python 87 5 Updated May 13, 2025

【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification

Python 71 5 Updated Mar 9, 2025

[NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing

Jupyter Notebook 33 Updated Dec 9, 2025

✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"

Python 422 39 Updated Jan 14, 2026

TraDiffusion: Trajectory-Based Training-Free Image Generation

Python 54 3 Updated Nov 10, 2024

[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities

Python 83 3 Updated Oct 10, 2024

INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model

Python 42 Updated Aug 4, 2024

[NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"

Python 206 6 Updated Jul 17, 2025

Dive into LLMs: a hands-on programming tutorial series

Jupyter Notebook 35,170 4,312 Updated Oct 10, 2025

PyTorch implementation of the paper "Toward Open-set Human Object Interaction Detection" (AAAI2024)

Python 6 1 Updated Feb 8, 2025

Official Implementation of SnAG (CVPR 2024)

Python 59 5 Updated Apr 26, 2025

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Python 296 13 Updated Jun 13, 2024

[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"

Python 2,830 145 Updated Jul 10, 2025