Stars
implementing minimal versions of joint-embedding predictive architecture (JEPA)
Official implementation of "StructXLIP: Enhancing Vision-language Models with Multimodal Structural Cues", CVPR 2026.
An open-source implementation for fine-tuning Qwen-VL series by Alibaba Cloud.
Official Codebase for "Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights" (ICML 2026 Spotlight)
AI agents running research on single-GPU nanochat training automatically
[ICLR 2026 🔥 ] Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"
A course on aligning smol models.
PyTorch Lightning implementation of Generative Recommenders
Solving algorithm problems is all about patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
(ICLR 2023) Official PyTorch implementation of "What Do Self-Supervised Vision Transformers Learn?"
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
simple pytorch pipeline for pretraining/finetuning vision models on imagenet-1k
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Official implementation of the paper "Transfer between Modalities with MetaQueries"
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
[TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training"
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
👋 Overcomplete is a Vision-based SAE Toolbox
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Open-source implementation of AlphaEvolve
A final sanity checklist to help your CS paper get accepted, not desk rejected.
The simplest, fastest repository for training/finetuning small-sized VLMs.
Implementing DeepSeek R1's GRPO algorithm from scratch