ZhendongWang6
🎯
Focusing



Qwen-Image-Layered: Layered Decomposition for Inherent Editability

Python 547 35 Updated Dec 22, 2025

PersonaLive! : Expressive Portrait Image Animation for Live Streaming

Python 657 67 Updated Dec 19, 2025

DDT: Decoupled Diffusion Transformer

Python 344 17 Updated Aug 22, 2025

[ICCV 2025] Official implementation for KV-Edit: Training-Free Image Editing for Precise Background Preservation

Python 361 17 Updated May 21, 2025

A survey for visual generation alignment

102 8 Updated Nov 9, 2025

Native Multimodal Models are World Learners

Python 1,366 52 Updated Nov 28, 2025
Python 1,735 77 Updated Dec 16, 2025

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Python 87 2 Updated Nov 29, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 54 Updated Nov 15, 2025

An early exploration that introduces Interleaving Reasoning to the text-to-image generation field and achieves SoTA benchmark performance. It also significantly improves the quality, fine-grain…

Python 80 Updated Sep 14, 2025

Qwen-Image text-to-image LoRA trainer

Python 654 58 Updated Dec 16, 2025

Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

Python 335 11 Updated Dec 22, 2025

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,600 123 Updated Oct 31, 2025

Official repository for the UAE paper, unified-GRPO, and unified-Bench

Python 151 6 Updated Sep 12, 2025
Python 579 16 Updated Nov 10, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,447 362 Updated Dec 19, 2025

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Python 238 5 Updated Aug 15, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,971 1,511 Updated Dec 17, 2025

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 2,576 257 Updated Aug 28, 2025

Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"

Jupyter Notebook 165 4 Updated Dec 17, 2025

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,863 140 Updated Jul 3, 2025

[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling

Python 3,141 306 Updated Dec 21, 2024

[NeurIPS 2025 Oral] Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

Python 230 17 Updated Oct 4, 2025

The world's first open-source multimodal creative assistant. This is a substitute for Canva and Manus that prioritizes privacy and is usable locally.

TypeScript 5,542 498 Updated Nov 10, 2025

The best OSS video generation models, created by Genmo

Python 3,539 468 Updated Nov 14, 2025

Enjoy the magic of Diffusion models!

Python 11,187 1,055 Updated Dec 20, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,964 2,216 Updated Dec 15, 2025

A curated list of recent diffusion models for video generation, editing, and various other applications.

5,304 327 Updated Dec 15, 2025