Stars
Scalable toolkit for efficient model reinforcement
Multimodal RL training framework for diffusion & omni models
[ICML 2026] Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
[ICLR 2026] An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training
Codebase for Merging Language Models (ICML 2024)
Official Implementation of OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning
🔥 OneThinker: All-in-one Reasoning Model for Image and Video [CVPR 2026]
[CVPR 2026] Official repo for "VideoSSR: Video Self-Supervised Reinforcement Learning"
[CVPR 2026] Boosting Reasoning in Large Multimodal Models via Activation Replay
Official code for "Rethinking Chain-of-Thought Reasoning for Videos"
[CVPR 2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
[CVPRF 2026] Official PyTorch code of "Weaver: End-to-End Agentic System Training for Video Interleaved Reasoning".
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
[ICLR 2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
RoboBrain 2.5: Advanced version of RoboBrain. Depth in Sight, Time in Mind.
Spirit-v1.5: A Robotic Foundation Model by Spirit AI
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
Building General-Purpose Robots Based on Embodied Foundation Model
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning