Stars
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework Based on veRL
Benchmarking Multi-Image Understanding in Vision and Language Models
[ACL 2024] Unified Efficient Fine-Tuning of 100+ LLMs & VLMs
[ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
[T-PAMI] A curated list of self-supervised multimodal learning resources.
[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
Implementation of "Meta Omnium: A Benchmark for General-Purpose Learning-to-Learn"
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
A playbook for systematically maximizing the performance of deep learning models.
Collection of LaTeX resources and examples.
SSSegmentation: An Open Source Supervised Semantic Segmentation Toolbox Based on PyTorch.
conST: an interpretable multi-modal contrastive learning framework for spatial transcriptomics
[ICLR 2023 spotlight] MEDFAIR: Benchmarking Fairness for Medical Imaging
Companion webpage to the book "Mathematics For Machine Learning"