Stars
We present StableAvatar, the first end-to-end video diffusion transformer, which synthesizes infinite-length high-quality audio-driven avatar videos without any post-processing, conditioned on a re…
Memory-Guided Diffusion for Expressive Talking Video Generation
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction" (CVPR 2024)
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
[ECCV2024] Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization
Segment Anything in 3D with NeRFs (NeurIPS 2023 & IJCV 2025)
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
A Framework for Speech, Language, Audio, Music Processing with Large Language Model
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
[ICLR 2024] Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions (ICCV 2023)
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework
Official implementation of "LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching"
[NeurIPS 2024] A Generalizable World Model for Autonomous Driving
LongLive: Real-time Interactive Long Video Generation
CraftsMan: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)
[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
[SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
[CVPR 2024] Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation