Official repository for the paper
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models, CVPR 2025 (Oral).
Felix Taubner<sup>1,2</sup>, Ruihang Zhang<sup>1</sup>, Mathieu Tuli<sup>3</sup>, David B. Lindell<sup>1,2</sup>
<sup>1</sup>University of Toronto, <sup>2</sup>Vector Institute, <sup>3</sup>LG Electronics
TL;DR: CAP4D turns any number of reference images into an animatable avatar.
# 1. Clone repo
git clone https://github.com/felixtaubner/cap4d/
cd cap4d
# 2. Create conda environment for CAP4D:
conda create --name cap4d_env python=3.10
conda activate cap4d_env
# 3. Install requirements
pip install -r requirements.txt

Follow the instructions to install PyTorch3D. Make sure to install it with CUDA support. We recommend installing from source:

pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
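To verify that PyTorch3D was installed with CUDA support, you can run a quick optional sanity check (this assumes the cap4d_env environment is active):

# Print the installed versions and whether CUDA is available
python -c "import torch, pytorch3d; print(torch.__version__, pytorch3d.__version__, torch.cuda.is_available())"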
Follow the instructions on the FLAME website to download the FLAME blendshape files. Locate flame2023_no_jaw.pkl and place it in data/assets/flame/.
Download the MMDM weights with this link, and place cap4d_mmdm_100k.ckpt in data/weights/mmdm/checkpoints/.
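For example, both downloads can be moved into place as follows (a sketch; the ~/Downloads source paths are assumptions, so adjust them to wherever your browser saved the files):

# Create the expected directories from the repository root
mkdir -p data/assets/flame data/weights/mmdm/checkpoints
# Move the downloaded files into place (adjust the source paths as needed)
mv ~/Downloads/flame2023_no_jaw.pkl data/assets/flame/
mv ~/Downloads/cap4d_mmdm_100k.ckpt data/weights/mmdm/checkpoints/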
Run the pipeline in debug settings to test the installation:

bash scripts/test_pipeline.sh

Check that a video is exported to examples/debug_output/tesla/sequence_00/renders.mp4. If it appears to show a blurry cartoon Nikola Tesla, you're all set!
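You can also confirm from the command line that the render was written:

# List the debug render; the command fails if the file is missing
ls -lh examples/debug_output/tesla/sequence_00/renders.mp4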
Run the provided scripts to generate avatars and animate them, each with a single command:

bash scripts/generate_felix.sh
bash scripts/generate_lincoln.sh
bash scripts/generate_tesla.sh

The output directories contain exported animations which you can view in real time.
Open the real-time viewer in your browser (powered by Brush). Click Load file and
upload the exported animation found in examples/output/{SUBJECT}/animation_{ID}/exported_animation.ply.
Coming soon! For now, only generations using the provided identities with precomputed FlowFace annotations are supported.
# Generate images with a single reference image
python cap4d/inference/generate_images.py --config_path configs/generation/single_ref.yaml --reference_data_path examples/input/lincoln/ --output_path examples/output/lincoln/

# Generate images with multiple reference images
python cap4d/inference/generate_images.py --config_path configs/generation/multi_ref.yaml --reference_data_path examples/input/felix/ --output_path examples/output/felix/

Note: the generation script will use all visible CUDA devices, and the more devices are available, the faster it runs. Generation takes hours and requires a lot of RAM (ideally more than 64 GB) to run smoothly.
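Since the script uses all visible CUDA devices, the standard CUDA_VISIBLE_DEVICES environment variable should control which GPUs it sees (shown here for the single-reference example):

# Restrict generation to GPUs 0 and 1
CUDA_VISIBLE_DEVICES=0,1 python cap4d/inference/generate_images.py --config_path configs/generation/single_ref.yaml --reference_data_path examples/input/lincoln/ --output_path examples/output/lincoln/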
python gaussianavatars/train.py --config_path configs/avatar/default.yaml --source_paths examples/output/{SUBJECT}/reference_images/ examples/output/{SUBJECT}/generated_images/ --model_path examples/output/{SUBJECT}/avatar/ --interval 5000
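If training fails to find its inputs, check that both source directories from the generation step exist (shown here for the lincoln subject):

# Both directories should contain images before training starts
ls examples/output/lincoln/reference_images/ examples/output/lincoln/generated_images/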
For now, only animations with precomputed FLAME annotations are supported. These animations are located in examples/input/animation/.
python gaussianavatars/animate.py --model_path examples/output/lincoln/avatar/ --target_animation_path examples/input/animation/sequence_00/fit.npz --target_cam_trajectory_path examples/input/animation/sequence_00/orbit.npz --output_path examples/output/lincoln/animation_00/ --export_ply 1 --compress_ply 0

The file passed to --target_animation_path contains the FLAME expressions and pose, while the (optional) --target_cam_trajectory_path contains the relative camera trajectory.
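To see which arrays an animation file provides, you can inspect it with NumPy (a sketch; the exact key names depend on what the FLAME tracker exports):

# Print each array name and its shape in the animation file
python -c "import numpy as np; d = np.load('examples/input/animation/sequence_00/fit.npz'); print({k: d[k].shape for k in d.files})"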
The MMDM code is based on ControlNet. The 4D Gaussian avatar code is based on GaussianAvatars. Special thanks to the authors for making their code public!
Related work:
- CAT3D: Create Anything in 3D with Multi-View Diffusion Models
- GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians
- FlowFace: 3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow
- StableDiffusion: High-Resolution Image Synthesis with Latent Diffusion Models
Awesome concurrent work:
- Pippo: High-Resolution Multi-View Humans from a Single Image
- Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars
@inproceedings{taubner2025cap4d,
author = {Taubner, Felix and Zhang, Ruihang and Tuli, Mathieu and Lindell, David B.},
title = {{CAP4D}: Creating Animatable {4D} Portrait Avatars with Morphable Multi-View Diffusion Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2025},
pages = {5318-5330}
}

This work was developed in collaboration with and with sponsorship from LG Electronics. We gratefully acknowledge their support and contributions throughout the course of this project.