Tsinghua University
Beijing
robertluo1.github.io
Stars
Welcome to GR00T Whole-Body Control (WBC)! This is a unified platform for developing and deploying advanced humanoid controllers. This includes: Decoupled WBC models used in NVIDIA Isaac-Gr00t, Gr0…
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
[ICLR 26] TempFlow-GRPO (Temporal Flow GRPO), a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation.
JoyAI-Image is the unified multimodal foundation model for image understanding, text-to-image generation, and instruction-guided image editing.
Single-stage End-to-End Training for Tokenization and Generation
[CVPR 2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs/2603.19232
[CVPR 2026 Highlight] Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Official code repository for "Self-transcendence: Is External Feature Guidance Indispensable for Accelerating Diffusion Transformer Training?"
Helios: Real Real-Time Long Video Generation Model
BitDance & UniWeTok: Open-source autoregressive models with binary visual tokens. A research project for building powerful multimodal autoregressive models.
[ICLR 2025] Binary Spherical Quantization + [CVPR 2026] Leech Spherical Quantization
Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
Qwen3.6 is the large language model series developed by the Qwen team, Alibaba Group.
[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
A unified inference and post-training framework for accelerated video generation.
Official repository for “PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss”
The official code of Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis
Code for the NIPS17 paper "Stabilizing Training of Generative Adversarial Networks through Regularization"
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders