Stars
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
The official implementation of the paper “VChain: Chain-of-Visual-Thought for Reasoning in Video Generation”
Code for CineScale, higher-resolution video generation based on Wan
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
Inception Score for GANs in PyTorch
PyTorch implementation of common image generation metrics.
Let's make video diffusion practical!
Wan: Open and Advanced Large-Scale Video Generative Models
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
🔥🔥🔥A curated list of papers on recent diffusion-based high-resolution image and video synthesis works.
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
[ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"
[ICCV 2025] Code for FreeScale, a tuning-free method for higher-resolution visual generation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A port of muerrilla's sd-webui-Detail-Daemon as a node for ComfyUI, to adjust sigmas that control detail.
Quick scripts to calculate CLIP text-image similarity
PyTorch - FID calculation with proper image resizing and quantization steps [CVPR 2022]
High-fidelity performance metrics for generative models in PyTorch
Improved AnimateDiff for ComfyUI and Advanced Sampling Support
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
[TPAMI 2025] ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
High-resolution models for human tasks.
Official inference repo for FLUX.1 models
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation