Stars
PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing
Qwen-Image text to image lora trainer
All That Glitters Is Not Gold: Key-Secured 3D Secrets within 3D Gaussian Splatting
[ICLR 2026] "Does FLUX Already Know How to Perform Physically Plausible Image Composition?" (Official Implementation)
[ICLR 2026] "DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing" (Official Implementation)
[CVPR 2026 Highlight] VideoCoF: Unified Video Editing with Temporal Reasoner
[ICLR 2026] ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
[NeurIPS 2025] Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control
[ICLR 2026 Oral (top 1.2%)] Official implementation of DepthLM
[ICLR 2026] BideDPO: Conditional Image Generation with Simultaneous Text and Condition Alignment
(NeurIPS 2025 D&B Track) OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
[ICLR 2026] Official repo of the paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Lear…
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
PyTorch implementation of the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
[ACMMM 2025] "Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts" (Official Implementation)
A Collection of Papers and Codes for CVPR2026/CVPR2025/ICCV2025/CVPR2024/ECCV2024 AIGC
Calligrapher: Freestyle Text Image Customization
Unified layout planning and image generation, ICCV2025
This repository open-sources CreatiPoster, an AI-driven graphic design generation system for multi-layer and editable compositions with strong visual appeal.
ControlThinker: Unveiling Latent Semantics for Controllable Image Generation through Visual Reasoning
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)
[ICLR 2026] A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design
[NeurIPS 2025] IEAP: Image Editing As Programs with Diffusion Models
Layout Conditioned Image Generation, NeurIPS2024