Stars
Official implementation of the paper “Endowing Vision-Language Models with System 2 Thinking for Fine-Grained Visual Recognition,” AAAI 2026.
Code for the paper "Conditional Representation Learning for Customized Tasks" (NeurIPS 2025 Spotlight)
An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"
VisRet: Visualization Improves Knowledge-Intensive Text-to-Image Retrieval
PyTorch Implementation of LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification
verl: Volcano Engine Reinforcement Learning for LLMs
Official Implementation of Visual Abstraction: A Plug-and-Play Approach for Text-Visual Retrieval
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
[ICML 2025 Oral] An official implementation of VideoRoPE & VideoRoPE++
Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]
This is a summary of research on All-In-One Image/Video Restoration. There may be omissions. If anything is missing please get in touch with us. Our emails: liboyun.gm@gmail.com; gouyuanbiao@gmail.…
This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our emails: linyijie.gm@gmail.com; yangmouxing@gmail.com; qinyang.gm…