Stars
[TPAMI 2025] Towards Visual Grounding: A Survey
Hackable and optimized Transformers building blocks, supporting a composable construction.
Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development
📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
[ICLR'25] MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
Sepsis Prediction using eICU database
Study on mechanical power in MIMIC-III and eICU-CRD
Tool for robust segmentation of >100 important anatomical structures in CT and MR images
[CVPR 2024] FairCLIP: Harnessing Fairness in Vision-Language Learning
Collection of awesome medical dataset resources.
A repo listing papers related to LLM-based agents
This project lists the files related to LLM-based AI agents.
An agentic RL framework to enhance retrieval-augmented reasoning in Diagnostic Policy
Benchmarking Reinforcement Learning Algorithms for ICU Ventilator Settings: A Patient Environment for Doctor Agents
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Segment Anything Model for Medical Image Segmentation: Open-Source Project Summary
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
Python package to calculate comorbidity scores including Charlson Comorbidity Score and Elixhauser Score and their weighted variants.
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, B…
The official repository of the paper 'Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine'
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
Deep Unfolding Network for Image Super-Resolution (CVPR, 2020) (PyTorch)