HeadEngine is the synthetic data generation pipeline for the CVPR 2026 Highlight paper UIKA: Fast Universal Head Avatar from Pose-Free Images.
It generates multi-view talking-head data and FLAME annotations from a synthetic
identity. The main entry point is run_headengine.py.
- Generate 9 canonical source views for each synthetic identity.
- Estimate multi-view camera parameters with VGGT.
- Animate all views with one LivePortrait driving template.
- Track FLAME parameters with VHAP and export NeRF-style data.
- Resume, rerun, or visualize stages with per-stage markers.
This repository is intended for Linux workstations with NVIDIA GPUs.
git clone https://github.com/Zijian-Wu/HeadEngine.git
cd HeadEngine

Prepare the environment and model weights with install/README.md.
All model weights should be placed under weight_zoo/ as described there.
Pipeline parameters are centralized in headengine/config.py.
Edit this file directly when changing paths, model weights, runtime settings, or generation parameters.
Common fields:
PathConfig.data_dir # output root, default: data_pool
RuntimeConfig.gpu_per_node # GPUs per node
RuntimeConfig.id_per_gpu # identities processed by each GPU worker
RuntimeConfig.matte_bs # StyleMatte batch size
LivePortraitConfig.driving # driving .mp4 or .pkl template
RuntimeConfig.matte_bs defaults to 78, which is intended for a 48 GB GPU. If
the matting stage runs out of VRAM, reduce it to 32, 16, or lower.
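The manual fallback above can also be automated. The following is a minimal sketch, not part of HeadEngine: `run_with_batch_backoff` and `run_batch` are hypothetical names, and `MemoryError` stands in for the GPU out-of-memory exception (with PyTorch one would likely pass `torch.cuda.OutOfMemoryError` instead). It halves the batch size whenever a batch fails, rather than stepping through 32 and 16 by hand.

```python
from typing import Callable, Sequence, Type, TypeVar

T = TypeVar("T")


def run_with_batch_backoff(
    items: Sequence[T],
    run_batch: Callable[[Sequence[T]], None],  # hypothetical per-batch worker
    batch_size: int = 78,                      # README default for a 48 GB GPU
    oom_error: Type[BaseException] = MemoryError,  # stand-in for a CUDA OOM error
) -> int:
    """Process items in batches, halving the batch size after each OOM.

    Returns the batch size that was in use when processing finished.
    """
    i = 0
    while i < len(items):
        try:
            run_batch(items[i : i + batch_size])
            i += batch_size  # advance only after the batch succeeded
        except oom_error:
            if batch_size == 1:
                raise  # cannot shrink further; surface the error
            batch_size = max(1, batch_size // 2)
    return batch_size
```

The returned value is a reasonable candidate to write back into RuntimeConfig.matte_bs for later runs on the same GPU.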
If LivePortraitConfig.driving points to a video such as assets/driving.mp4,
the pipeline reuses the sibling template assets/driving.pkl when it exists and
is newer than the video; otherwise it regenerates the template from the video.
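The template lookup can be sketched as follows. This is an illustration under assumptions, not the pipeline's code: `resolve_driving_template` is a hypothetical helper, and "newer" is interpreted here as a later file modification time.

```python
from pathlib import Path
from typing import Tuple


def resolve_driving_template(driving: str) -> Tuple[Path, bool]:
    """Return (template_path, needs_regeneration) for a driving input.

    Assumption: "newer" means the .pkl has a later modification time
    than the video it sits next to.
    """
    path = Path(driving)
    if path.suffix.lower() == ".pkl":
        return path, False  # already a template, use it directly

    sibling = path.with_suffix(".pkl")  # e.g. assets/driving.mp4 -> assets/driving.pkl
    if sibling.exists() and sibling.stat().st_mtime > path.stat().st_mtime:
        return sibling, False  # cached template is up to date
    return sibling, True  # missing or stale: regenerate from the video
```

For example, `resolve_driving_template("assets/driving.mp4")` returns the path `assets/driving.pkl` together with a flag saying whether that file still has to be rebuilt.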
Run one process on a single GPU:
CUDA_VISIBLE_DEVICES=0 python run_headengine.py --node 0 --local-rank 0

Launch one worker per local GPU:
python multi_gpu_launcher.py 0

Useful options:
python run_headengine.py --node 0 --local-rank 0 --vis-mesh
python run_headengine.py --node 0 --local-rank 0 --force-stage animate_mv_img
python run_headengine.py --node 0 --local-rank 0 --force-all
python multi_gpu_launcher.py 0 --num-gpus 4 --start-delay 1 --vis-mesh

Stages for --force-stage: gen_mv_head_img, estimate_cam_using_vggt,
animate_mv_img, track_mv_img, export_nerf_dataset, vis_mesh.
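To illustrate how per-stage markers can drive resume and rerun behavior, here is a minimal sketch. The stage names come from the list above; everything else is an assumption made for illustration: the `stages_to_run` helper, the `<stage>.done` marker file name, and the choice that forcing a stage reruns only that stage (the README does not say whether downstream stages are rerun as well).

```python
from pathlib import Path
from typing import List, Optional

# Stage names as listed for --force-stage, in pipeline order.
STAGES = [
    "gen_mv_head_img",
    "estimate_cam_using_vggt",
    "animate_mv_img",
    "track_mv_img",
    "export_nerf_dataset",
    "vis_mesh",
]


def stages_to_run(
    mark_dir: Path,                     # e.g. data_pool/done_marks/000000
    force_stage: Optional[str] = None,  # mirrors --force-stage
    force_all: bool = False,            # mirrors --force-all
    vis_mesh: bool = False,             # mirrors --vis-mesh
) -> List[str]:
    """Select the stages that still need to run for one identity."""
    if force_stage is not None and force_stage not in STAGES:
        raise ValueError(f"unknown stage: {force_stage}")

    selected = []
    for stage in STAGES:
        # vis_mesh is optional: skip it unless explicitly requested.
        if stage == "vis_mesh" and not (vis_mesh or force_stage == "vis_mesh"):
            continue
        # Hypothetical marker layout: one empty "<stage>.done" file per stage.
        done = (mark_dir / f"{stage}.done").exists()
        if force_all or stage == force_stage or not done:
            selected.append(stage)
    return selected
```

With no markers present the helper returns the five mandatory stages; once markers exist, only unfinished or forced stages remain.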
For ID 000000, a typical run writes:
data_pool/
|-- src_img/
|   `-- 000000/              # SphereHead source views
|-- fast_view/
|   `-- 000000.jpg           # 3x3 preview grid
|-- export/
|   `-- 000000/
|       |-- vggt_cam.json    # VGGT camera parameters
|       |-- images/          # animated RGB frames
|       |-- fg_masks/        # foreground masks
|       |-- transforms.json  # NeRF-style metadata
|       |-- flame_param/     # tracked FLAME parameters
|       `-- vis_mesh/        # optional image + FLAME mesh overlays
|-- vhap_track/
|   `-- 000000/              # VHAP optimization outputs
|-- done_marks/
|   `-- 000000/              # per-stage resume markers
`-- log/                     # multi-GPU launcher logs
Animated images, masks, and mesh overlays use the same <frame>_<view> naming
scheme, for example images/00012_03.png, fg_masks/00012_03.png, and
vis_mesh/00012_03.png.
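The shared naming scheme makes it easy to pair an image with its mask and overlay. The helpers below are a hypothetical sketch: the five-digit frame and two-digit view padding are inferred only from the 00012_03.png example, and the function names are not part of HeadEngine.

```python
import re
from pathlib import Path
from typing import Dict, Tuple

# Assumed pattern: digits for the frame, underscore, digits for the view.
_NAME_RE = re.compile(r"^(\d+)_(\d+)\.png$")


def frame_view_name(frame: int, view: int) -> str:
    """Build a file name such as 00012_03.png (padding widths are assumed)."""
    return f"{frame:05d}_{view:02d}.png"


def parse_frame_view(name: str) -> Tuple[int, int]:
    """Split a <frame>_<view>.png file name into integer indices."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not a <frame>_<view>.png name: {name}")
    return int(m.group(1)), int(m.group(2))


def export_paths(export_dir: Path, frame: int, view: int) -> Dict[str, Path]:
    """Paths of the image, mask, and optional mesh overlay for one frame/view."""
    name = frame_view_name(frame, view)
    return {sub: export_dir / sub / name for sub in ("images", "fg_masks", "vis_mesh")}
```

For instance, `export_paths(Path("data_pool/export/000000"), 12, 3)["fg_masks"]` points at `data_pool/export/000000/fg_masks/00012_03.png`.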
If you find this project useful, please cite:
@misc{wu2026headengine,
title = {HeadEngine},
author = {Wu, Zijian},
url = {https://github.com/Zijian-Wu/HeadEngine},
year = {2026}
}

@inproceedings{wu2026uika,
title = {UIKA: Fast Universal Head Avatar from Pose-Free Images},
author = {Wu, Zijian and Zhou, Boyao and Hu, Liangxiao and Liu, Hongyu and Sun, Yuan and Wang, Xuan and Cao, Xun and Shen, Yujun and Zhu, Hao},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2026}
}

HeadEngine is part of UIKA and builds on VHAP, SphereHead, LivePortrait, and VGGT.
Please cite and follow the licenses of the upstream projects.