Zhendong Wang ZhendongWang6

🎯

Focusing

Ph.D. student focusing on computer vision and deep learning.

131 followers · 53 following

University of Science and Technology of China
Beijing, China
https://zhendongwang6.github.io/
https://scholar.google.com.hk/citations?user=Ya5VDjQAAAAJ&hl=zh-CN

Lists (27)

chatgpt

clip

controlnet

dataset

diffusion model

67 repositories

face-anti-spoofing

10 repositories

face-forgery-detection

23 repositories

flow

gan

img2img

interview

knowledge distillation

large language models

10 repositories

large vision model

ocr

pretrain

r1

SAM series

score metrics

segmentation

subject driven generation

survey

tools

16 repositories

vae

10 repositories

video generation

vision_language

visual text generation

Stars

QwenLM / Qwen-Image-Layered

Qwen-Image-Layered: Layered Decomposition for Inherent Editability

Python 547 35 Updated Dec 22, 2025

GVCLab / PersonaLive

PersonaLive! : Expressive Portrait Image Animation for Live Streaming

Python 657 67 Updated Dec 19, 2025

meituan-longcat / LongCat-Image

Python 490 37 Updated Dec 16, 2025

MCG-NJU / DDT

DDT: Decoupled Diffusion Transformer

Python 344 17 Updated Aug 22, 2025

Xilluill / KV-Edit

[ICCV 2025] Official implementation for KV-Edit: Training-Free Image Editing for Precise Background Preservation

Python 361 17 Updated May 21, 2025

XueZeyue / Awesome-Visual-Generation-Alignment-Survey

A survey for visual generation alignment

102 8 Updated Nov 9, 2025

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,366 52 Updated Nov 28, 2025

apple / pico-banana-400k

Python 1,735 77 Updated Dec 16, 2025

TIGER-AI-Lab / EditReward

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Python 87 2 Updated Nov 29, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 54 Updated Nov 15, 2025

Osilly / Interleaving-Reasoning-Generation

An early exploration that introduces Interleaving Reasoning to the text-to-image generation field and achieves SoTA benchmark performance. It also significantly improves the quality, fine-grain…

Python 80 Updated Sep 14, 2025

FlyMyAI / flymyai-lora-trainer

Qwen-Image text-to-image LoRA trainer

Python 654 58 Updated Dec 16, 2025

HorizonWind2004 / reconstruction-alignment

Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

Python 335 11 Updated Dec 22, 2025

Tencent-Hunyuan / HunyuanImage-3.0

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,600 123 Updated Oct 31, 2025

PKU-YuanGroup / UAE

Official repository for the UAE paper, unified-GRPO, and unified-Bench

Python 151 6 Updated Sep 12, 2025

stepfun-ai / NextStep-1

Python 579 16 Updated Nov 10, 2025

QwenLM / Qwen-Image

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,447 362 Updated Dec 19, 2025

wyhlovecpp / GPT-Image-Edit

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Python 238 5 Updated Aug 15, 2025

Wan-Video / Wan2.2

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,971 1,511 Updated Dec 17, 2025

facebookresearch / vjepa2

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 2,576 257 Updated Aug 28, 2025

fudan-generative-vision / PPFlow

Python 7 1 Updated Jul 20, 2025

Jiawei-Yang / DeTok

Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"

Jupyter Notebook 165 4 Updated Dec 17, 2025

Yuanshi9815 / OminiControl

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,863 140 Updated Jul 3, 2025

jy0205 / Pyramid-Flow

[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling

Python 3,141 306 Updated Dec 21, 2024

Martinser / REG

[NeurIPS 2025 Oral] Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

Python 230 17 Updated Oct 4, 2025

11cafe / jaaz

The world's first open-source multimodal creative assistant. This is a substitute for Canva and Manus that prioritizes privacy and is usable locally.

TypeScript 5,542 498 Updated Nov 10, 2025

genmoai / mochi

The best OSS video generation models, created by Genmo

Python 3,539 468 Updated Nov 14, 2025

modelscope / DiffSynth-Studio

Enjoy the magic of Diffusion models!

Python 11,187 1,055 Updated Dec 20, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,964 2,216 Updated Dec 15, 2025

showlab / Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, and various other applications.

5,304 327 Updated Dec 15, 2025