Stars
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Fully open reproduction of DeepSeek-R1
An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
Solve Visual Understanding with Reinforced VLMs
Democratizing Reinforcement Learning for LLMs
Witness the aha moment of VLM with less than $3.
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges
[NeurIPS 2025] MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
[ICLR 2025] Automated Design of Agentic Systems
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
🐝 When Agent Meets RL and Prompt Optimization the First Time
A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Repository containing SoPs and other reference material for the graduate admission process.
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning
This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.
This is the official Python version of Angles Don’t Lie: Unlocking Training-Efficient RL Through the Model’s Own Signals.
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Official code implementation for the ICLR 2025 paper "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives"
This is the official PyTorch implementation of the NeurIPS 2023 paper MathNAS: If Blocks Have a Role in Mathematical Architecture Design.
[ICML 2025] Official Repo for Stability-guided Adaptive Diffusion Acceleration. 🚀🌙 Accelerating off-the-shelf diffusion models with a unified stability criterion.
[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems