Stars
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
Towards Holistic evaluation of Generative Diffusion Transformers!
[CVPR2026] LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
The agent that grows with you
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
[ICML 2026] ByteDance's All-in-One Video Generation Model for Human-Object Interaction Video Generation
AI agents running research on single-GPU nanochat training automatically
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…
ArgusBot: A 24/7 supervisor Agent for Codex CLI and Claude Code CLI that keeps agents running, reviewing, and planning until the job is actually done.
Fast, Sharp & Reliable Agentic Intelligence
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
A unified framework for easy reinforcement learning in Flow-Matching models
A Systematic Alignment Framework for High-Fidelity, Controllable, and Robust Video Generation.
Official implementation of AsymFlow, pi-Flow, GMFlow
Streaming Flux editor: live camera → editing every frame at interactive FPS, based on FLUX.2-Klein-4B. Runs on a single H100 at 15+ FPS.
[Tech Report] Alive: A Unified Audio-Video Generation Model
A survey for visual generation alignment
Official code for the paper "Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models"
Scalable group inference for generating high quality and diverse images with diffusion models.
LoRA fine-tuning for FLUX.2 to improve virtual try-on (VTON) capabilities
Implementation of Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models
[Tutorial] Few-Step Distillation for Text-to-Image Generation: A Practical Guide
Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
[ICLR 2026] Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?