Stars
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models
Latent Collaboration in Multi-Agent Systems
Official code implementation for paper "PathAgent: Toward Interpretable Analysis of Whole-slide Pathology Images via Large Language Model-based Agentic Reasoning"
The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
Official page for ICLR 2025 paper "Sufficient Context: A New Lens on Retrieval Augmented Generation Systems"
Recurrence Meets Transformers for Universal Multimodal Retrieval
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
[CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".
This is the official repository for Retrieval Augmented Visual Question Answering
Computational Pathology Toolbox developed by TIA Centre, University of Warwick.
The official repo for the paper "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods".
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning"
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." — Andrej Karpathy. A frontier, first-principles handbook inspi…
Implementation Code for "LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination"
Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
[ACM MM 2025 🔥🔥 ] MIRA: A first-of-its-kind medical RAG framework that fuses image features and retrieved knowledge with dynamic context control to boost factual accuracy in multimodal medical reas…
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients. Published in Nature.
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
Nexent is a zero-code platform for auto-generating agents — no orchestration, no complex drag-and-drop required. Nexent also offers powerful capabilities for agent running control, data processing …
The official implementation of the paper "PTCMIL: Multiple Instance Learning via Prompt Token Clustering for Whole Slide Image Analysis", accepted at MICCAI 2025.
Toolkit for large-scale whole-slide image processing.