[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
✨✨Latest Advances on Multimodal Large Language Models
An up-to-date curated list of state-of-the-art research, papers, and resources on hallucinations in large vision-language models
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
A curated list of resources dedicated to the safety of Large Vision-Language Models. This repository aligns with our survey titled A Survey of Safety on Large Vision-Language Models: Attacks, Defen…
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
🚀 LeetCode From Zero To One: curated problem lists, solution write-ups, algorithm templates, and a practice roadmap; continuously updated...
Code for paper "Membership Inference Attacks Against Vision-Language Models"
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
[NeurIPS'25] VLMs Can Aggregate Scattered Training Patches
Code for paper "The Philosopher’s Stone: Trojaning Plugins of Large Language Models"
[NeurIPS'22] EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
This repository provides a benchmark for prompt injection attacks and defenses in LLMs
[USENIX Security'25] THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models
This GitHub repository summarizes research papers on AI security from the four top academic conferences.
Code and data for the paper: On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models
Code and data for "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image"
[CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang
Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.
LAVIS - A One-stop Library for Language-Vision Intelligence
Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime
An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.