[ICML 2025 Spotlight] Official repo for the paper "HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation"

Python 1,574 234 Updated Nov 2, 2025

StanfordMIMI / Merlin

Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.

Python 173 17 Updated Oct 22, 2025

facebookresearch / sam3

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 6,247 726 Updated Dec 11, 2025

MIC-DKFZ / VoxTell

Free-Text Promptable Universal 3D Medical Image Segmentation

Python 56 1 Updated Dec 17, 2025

xiaoman-zhang / KAD

Python 153 11 Updated Aug 29, 2024

rajpurkarlab / CheXzero

This repository contains code to train a self-supervised learning model on chest X-ray images that lack explicit annotations and evaluate this model's performance on pathology-classification tasks.

Python 214 47 Updated Aug 28, 2023

swiftbar / SwiftBar

Powerful macOS menu bar customization tool

Swift 3,598 108 Updated Nov 7, 2025

meituan-longcat / LongCat-Video

Python 1,541 199 Updated Dec 20, 2025

RainBowLuoCS / DEEM

(ICLR 2025 Spotlight) DEEM: Official implementation of "Diffusion Models Serve as the Eyes of Large Language Models for Image Perception".

Python 44 5 Updated Jul 1, 2025

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,194 2,684 Updated Aug 12, 2024

facebookresearch / dinov3

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,870 654 Updated Nov 20, 2025

cvlab-kaist / VIRAL

Official implementation of "VIRAL: Visual Representation Alignment for MLLMs".

Python 140 8 Updated Sep 21, 2025

deepseek-ai / DeepSeek-OCR

Contexts Optical Compression

Python 21,509 1,925 Updated Oct 25, 2025

mashijie1028 / GenHancer

(ICCV 2025) Enhance CLIP and MLLM's fine-grained visual representations with generative models.

Python 74 4 Updated Jun 25, 2025

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python 18,756 2,073 Updated Dec 17, 2025

StanfordMIMI / RoentGen-v2

Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data

Python 17 1 Updated Sep 12, 2025

QwenLM / Qwen-Image

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,438 361 Updated Dec 19, 2025

wanghao9610 / X-SAM

[AAAI2026] X-SAM: From Segment Anything to Any Segmentation

Python 334 10 Updated Nov 28, 2025

liaolea / TransPrune

7 1 Updated Jul 29, 2025

Sliver-g / Cardiac-CLIP

Python 18 1 Updated Nov 28, 2025

open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark

Python 32,190 9,831 Updated Aug 21, 2024

jd-opensource / joyagent-jdgenie

An open-source, end-to-end, production-grade general-purpose agent

Java 11,389 1,415 Updated Dec 16, 2025

Liuziyu77 / Visual-RFT

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,286 103 Updated Oct 29, 2025

coze-dev / coze-studio

An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.

TypeScript 19,075 2,696 Updated Dec 18, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 65,847 12,099 Updated Dec 21, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,758 1,071 Updated Dec 21, 2025

Zhixuan CHEN zhi-xuan-chen

Lists (14)

Agent

📚 Course

CT Tools

Dataset Construction

Detection

Gen4Und

Image Generation

LLM

Mac

🚀 My stack

RAG

RL

Segmentation

Video Generation
