Stars
Taming large-scale full-parameter few-step training with self-adversarial flows! 👏🏻
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Dimple, the first Discrete Diffusion Multimodal Large Language Model
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
dInfer: An Efficient Inference Framework for Diffusion Language Models
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'
A high-throughput and memory-efficient inference and serving engine for LLMs
Enjoy the magic of Diffusion models!
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
[ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training developed by Alibaba Cloud.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.
[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
General technology for enabling AI capabilities with LLMs and MLLMs