SYSU
- ShenZhen, China
- https://github.com/zimenglan-sysu-512
- https://blog.csdn.net/zimenglan_sysu
Stars
[NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
Development code and experiments for training diffusion models from scratch
Official repository for the paper "PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset"
[ICML 2026] The official code of Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis
JLT: Clean-Latent Prediction in Latent Diffusion Transformers
🚀 [ICLR 2026] SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation
PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion
IQA: Deep Image Structure and Texture Similarity Metric
The official code repo of 1.x-Distill, a stagewise distillation framework for diverse, high-quality, and efficient few-step generation, with support for fractional-step inference and MLP-based …
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation👏
Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation
🔥 Official impl. of "DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing".
[CVPR 2026 Best Paper Finalist] Pixel Diffusion Transformers for Image Generation
Modular SenseNova skills for building AI-powered office assistants and productivity workflows
SenseNova-U series: Native Unified Paradigm with NEO-unify from the First Principles
[CVPR 2026] PhotoFramer: Multi-modal Image Composition Instruction
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation
[ICLR 2026] Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
[ICLR 2026] Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation
[CVPR2026] VOSR: A Vision-Only Generative Model for Image Super-Resolution
GEditBench v2: A Human-Aligned Benchmark for General Image Editing
DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment (CVPR 2026)
[CVPR'26 Demo] Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device
Official dataset release for 'UniSER: A Foundation Model for Unified Soft Effects Removal', CVPR 2026.
[Official Repo] SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
JoyAI-Image is the unified multimodal foundation model for image understanding, text-to-image generation, and instruction-guided image editing.
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis