[ACL 2024] Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding

Python 15 3 Updated Nov 10, 2025

(CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"

Python 525 29 Updated Dec 18, 2025

Benchmarking Generalized Out-of-Distribution Detection

Python 1,024 163 Updated Dec 1, 2025

The official implementation of Delta Energy: Optimizing Energy Change During Vision-Language Alignment Improves both OOD Detection and OOD Generalization (NeurIPS2025)

Python 3 Updated Oct 13, 2025

The official implementation of InfoBound: A Provable Information-Bounds Inspired Framework for Both OoD Generalization and OoD Detection (T-PAMI 2025)

Python 2 Updated Oct 13, 2025

[ICLR2025] The official implementation of Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models

Python 16 Updated Jul 3, 2025

(CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning

Python 54 5 Updated Aug 16, 2024

[NeurIPS2023] LoCoOp: Few-Shot Out-of-Distribution Detection via Prompt Learning

Python 102 4 Updated Jul 5, 2025

This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attribution and Intervention, ICLR 2025".

Python 30 Updated Jul 14, 2025

[ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality

Python 59 2 Updated Jul 5, 2025

🔥 [NeurIPS 2025] Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospective Resampling (REVERSE)"

Python 49 5 Updated Sep 18, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,896 655 Updated Nov 20, 2025

The example of correspondence between fine classes and superclasses (coarse classes) in ImageNet.

Python 13 3 Updated Dec 4, 2024

Adaptation of vision-language models (CLIP) to downstream tasks using local and global prompts.

Python 50 3 Updated Jul 10, 2025

Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning

Python 131 7 Updated Jun 30, 2025

[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

Python 60 7 Updated Apr 8, 2024

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023

Python 161 5 Updated Sep 9, 2024

Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)

Python 12 1 Updated Mar 6, 2025

Data release for the ImageInWords (IIW) paper.

JavaScript 223 7 Updated Nov 17, 2024

[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Python 414 21 Updated Dec 22, 2024

[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning

Python 1,435 85 Updated Jun 26, 2025

[NeurIPS 2024] Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models

Python 39 1 Updated Oct 17, 2024

[ICLR 2024 Spotlight] "Negative Label Guided OOD Detection with Pretrained Vision-Language Models"

Python 29 5 Updated Oct 23, 2024

PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)

Python 246 27 Updated Jun 10, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,770 375 Updated Oct 21, 2025

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 19,999 1,671 Updated Nov 26, 2025

[ICLR 2024] Test-Time RL with CLIP Feedback for Vision-Language Models.

Python 95 2 Updated Oct 20, 2025

[CVPR2025] The implementation of the paper "OODD: Test-time Out-of-Distribution Detection with Dynamic Dictionary".

Python 18 2 Updated May 9, 2025

[ICCV 2025] VisRL: Intention-Driven Visual Perception via Reinforced Reasoning

Python 41 3 Updated Nov 8, 2025