Stars
Source code for 👏🏻"CLAP: Contrastive Latent Action Pretraining for Learning Vision-Language-Action Models from Human Videos"
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation👏
Implementation for the paper "StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction".
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
Gen-Searcher: Reinforcing Agentic Search for Image Generation
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
[ICML 2026] Multimodal deep-research MLLM and benchmark. The first long-horizon multimodal deep-research MLLM, extending the number of reasoning turns to dozens and the number of search-engine inte…
Robust recipes to align language models with human and AI preferences
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your Claude Code/Codex/Gemini agent will be an AI research agent with full horsepowe…
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation.
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
[CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
[ICLR 2026] Taming large-scale few-step training with self-adversarial flows! 👏🏻
The official implementation for [NeurIPS 2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free