HKUST
Stars
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
A multiscale multimodal large language model for radiology report generation (RRG) tasks
[ICML 2025 Spotlight] Official repo for the paper "HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation"
Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Free-Text Promptable Universal 3D Medical Image Segmentation
This repository contains code to train a self-supervised learning model on chest X-ray images that lack explicit annotations and evaluate this model's performance on pathology-classification tasks.
(ICLR 2025 Spotlight) Official implementation of "DEEM: Diffusion models serve as the eyes of large language models for image perception".
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Reference PyTorch implementation and models for DINOv3
Official implementation of "VIRAL: Visual Representation Alignment for MLLMs".
(ICCV 2025) Enhance CLIP and MLLM's fine-grained visual representations with generative models.
Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
[AAAI2026] X-SAM: From Segment Anything to Any Segmentation
OpenMMLab Detection Toolbox and Benchmark
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.
A high-throughput and memory-efficient inference and serving engine for LLMs
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …