CVPR-MIA

Recent papers on medical image analysis published at CVPR. [Github]

🌟🌟🌟 To add to or correct this list (e.g., highlight or oral status), please contact me at 1729766533 [at] qq [dot] com or send a pull request.

Last updated: 2025/06/20

CVPR2025

Image Generation (图像生成)

Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis. [Paper][Code]
Blood Flow Speed Estimation with Optical Coherence Tomography Angiography Images. [Paper][Code]
ZoomLDM: Latent Diffusion Model for multi-scale image generation. [Paper][Code]

Image Segmentation (图像分割)

nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark. [Paper][Code]
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline. [Paper][Code]
Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation. [Paper][Code]
DyCON: Dynamic Uncertainty-aware Consistency and Contrastive Learning for Semi-supervised Medical Image Segmentation. [Paper][Code]
LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging. [Paper][Code]
EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation. [Paper][Code]
Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models. [Paper][Code]
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation. [Paper][Code]
Boost the Inference with Co-training: A Depth-guided Mutual Learning Framework for Semi-supervised Medical Polyp Segmentation (RD-Net). [Paper][Code]
Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation. [Paper][Code]

Medical Pre-training & Foundation Model (预训练&基础模型)

Multi-modal Vision Pre-training for Medical Image Analysis. [Paper][Code]
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning. [Paper][Code]
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance. [Paper][Code]

Vision-Language Model (视觉-语言)

VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge. [Paper][Code]
BIOMEDICA: An Open Biomedical Image-Caption Archive with Vision-Language Models derived from Scientific Literature. [Paper][Project]
BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models. [Paper][Code]
MIMO: A medical vision language model with visual referring multimodal input and pixel grounding multimodal output. [Paper][Code]
Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis. [Paper][Code]
Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering. [Paper][Code]
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation. [Paper][Code]
FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification. [Paper][Code]
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations. [Paper][Code]

Computational Pathology (计算病理)

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification. [Paper][Code][Post]
Distilled Prompt Learning for Incomplete Multimodal Survival Prediction. [Paper][Code]
Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning. [Paper][Code]
SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding. [Paper][Code]
2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification. [Paper][Code]
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology. [Paper][Code]
MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images. [Paper][Code]
HistoFS: Non-IID Histopathologic Whole Slide Image Classification via Federated Style Transfer with RoI-Preserving. [Paper][Code]
M3amba: Memory Mamba is All You Need for Whole Slide Image Classification. [Paper][Code]
Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging. [Paper][Code]
BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology. [Paper][Code]
Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation. [Paper][Code]
TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model. [Paper][Code]
Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information. [Paper][Code]
Robust Multimodal Survival Prediction with the Latent Differentiation Conditional Variational AutoEncoder. [Paper][Code]
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification. [Paper][Code]
Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images. [Paper][Code]
Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning. [Paper][Code]
WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression. [Paper][Code]

Others

Q-PART: Quasi-Periodic Adaptive Regression with Test-time Training for Pediatric Left Ventricular Ejection Fraction Regression.
Towards All-in-One Medical Image Re-Identification. [Paper][Code]
OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection. [Paper][Code]
MultiMorph: On-demand Atlas Construction. [Paper][Code]

CVPR2024

Image Reconstruction (图像重建)

QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction. [Paper][Code][Project]
Fully Convolutional Slice-to-Volume Reconstruction for Single-Stack MRI. [Paper][Code]
Structure-Aware Sparse-View X-ray 3D Reconstruction. [Paper][Code]
Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI. [Paper][Code]

Image Super-Resolution (图像超分)

Learning Large-Factor EM Image Super-Resolution with Generative Priors. [Paper][Code][Video]
CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data. [Paper][Code]

Image Registration (图像配准)

Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration. [Paper]
[Oral & Best Paper Candidate!!!] Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration. [Paper][Code]

Image Segmentation (图像分割)

PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation. [Paper]
Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation. [Paper][Code]
One-Prompt to Segment All Medical Images. [Paper][Code]
Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention. [Paper][Code][Project]
Diversified and Personalized Multi-rater Medical Image Segmentation. [Paper][Code]
MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling. [Paper][Code]
Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation. [Paper][Code]
Cross-dimension Affinity Distillation for 3D EM Neuron Segmentation. [Paper][Code]
ToNNO: Tomographic Reconstruction of a Neural Network’s Output for Weakly Supervised Segmentation of 3D Medical Images. [Paper][Code]
Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation. [Paper][Code]
Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Anthropic Prior Knowledge. [Paper][Code]
Tyche: Stochastic In-Context Learning for Universal Medical Image Segmentation. [Paper][Code]
Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation. [Paper][Code]
S2VNet: Universal Multi-Class Medical Image Segmentation via Clustering-based Slice-to-Volume Propagation. [Paper][Code]
EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation. [Paper][Code]
Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation. [Paper][Code]
ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting. [Paper][Code]
PH-Net: Semi-Supervised Breast Lesion Segmentation via Patch-wise Hardness. [Paper][Code][Video]

Image Generation (图像生成)

Learned representation-guided diffusion models for large-image generation. [Paper][Code]
MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant. [Paper]
Towards Generalizable Tumor Synthesis. [Paper][Code]
Data-Efficient Unsupervised Interpolation Without Any Intermediate Frame for 4D Medical Images. [Paper][Code]

Image Classification (图像分类)

Systematic comparison of semi-supervised and self-supervised learning for medical image classification. [Paper][Code]
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images. [Paper][Code]

Federated Learning (联邦学习)

Think Twice Before Selection: Federated Evidential Active Learning for Medical Image Analysis with Domain Shifts. [Paper]

Medical Pre-training & Foundation Model (预训练&基础模型)

VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis. [Paper][Code]
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning. [Paper]
[Highlight!] Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning. [Paper][Code]
Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models. [Paper][Code]
Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding. [Paper][Code]
Low-Rank Knowledge Decomposition for Medical Foundation Models. [Paper][Code]

Vision-Language Model (视觉-语言)

PairAug: What Can Augmented Image-Text Pairs Do for Radiology? [Paper][Code]
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Matching Framework. [Paper][Code]
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images. [Paper][Code]
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM. [Paper][Code]
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification. [Paper][Code]
FairCLIP: Harnessing Fairness in Vision-Language Learning. [Paper][Code][Post]

Computational Pathology (计算病理)

Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction. [Paper]
Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology. [Paper][Code]
PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation. [Paper]
ChAda-ViT: Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images. [Paper][Code]
SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology. [Paper][Code]
Transcriptomics-guided Slide Representation Learning in Computational Pathology. [Paper][Code]

Others

Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling. [Paper]
FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders. [Paper][Code]

Acknowledgement

Some CVPR 2025 papers are sourced from https://github.com/cerishleon/cvpr25_medical_paper
