HeadEngine

[Figure: HeadEngine synthetic data teaser]

HeadEngine is the synthetic data generation pipeline for the CVPR 2026 Highlight paper UIKA: Fast Universal Head Avatar from Pose-Free Images.

It generates multi-view talking-head data and FLAME annotations from a synthetic identity. The main entry point is run_headengine.py.

Features

  • Generate 9 canonical source views for each synthetic identity.
  • Estimate multi-view camera parameters with VGGT.
  • Animate all views with one LivePortrait driving template.
  • Track FLAME parameters with VHAP and export NeRF-style data.
  • Resume, rerun, or visualize stages with per-stage markers.

Installation

This repository is intended for Linux workstations with NVIDIA GPUs.

git clone https://github.com/Zijian-Wu/HeadEngine.git
cd HeadEngine

Follow install/README.md to set up the environment and download the model weights. Place all model weights under weight_zoo/ as described there.

Configuration

Pipeline parameters are centralized in headengine/config.py. Edit this file directly when changing paths, model weights, runtime settings, or generation parameters.

Common fields:

PathConfig.data_dir            # output root, default: data_pool
RuntimeConfig.gpu_per_node     # GPUs per node
RuntimeConfig.id_per_gpu       # identities processed by each GPU worker
RuntimeConfig.matte_bs         # StyleMatte batch size
LivePortraitConfig.driving     # driving .mp4 or .pkl template

RuntimeConfig.matte_bs defaults to 78, which is intended for a 48 GB GPU. If the matting stage runs out of VRAM, reduce it to 32, 16, or lower.

If LivePortraitConfig.driving points to a video such as assets/driving.mp4, the pipeline reuses a newer sibling assets/driving.pkl template when it exists; otherwise it regenerates the template from the video.
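The template reuse rule above can be sketched as a timestamp comparison. This is a minimal illustration, not the pipeline's actual code: the function name resolve_driving_template is hypothetical, and it assumes "newer" means a strictly later file modification time than the video.

```python
from pathlib import Path


def resolve_driving_template(driving: str) -> tuple[Path, bool]:
    """Return (template_path, needs_regeneration) for a driving setting.

    Hypothetical helper: assumes "newer" is decided by file mtime.
    """
    src = Path(driving)
    if src.suffix.lower() == ".pkl":
        # A .pkl setting is already a template, so use it directly.
        return src, False

    # Sibling template next to the video, e.g. assets/driving.pkl.
    template = src.with_suffix(".pkl")
    if (
        template.exists()
        and src.exists()
        and template.stat().st_mtime > src.stat().st_mtime
    ):
        # The cached template is newer than the video: reuse it.
        return template, False

    # Missing or stale template: it must be regenerated from the video.
    return template, True
```

Under these assumptions, editing or replacing assets/driving.mp4 makes the cached template stale, so the next run rebuilds it.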

Usage

Run one process on a single GPU:

CUDA_VISIBLE_DEVICES=0 python run_headengine.py --node 0 --local-rank 0

Launch one worker per local GPU:

python multi_gpu_launcher.py 0

Useful options:

python run_headengine.py --node 0 --local-rank 0 --vis-mesh
python run_headengine.py --node 0 --local-rank 0 --force-stage animate_mv_img
python run_headengine.py --node 0 --local-rank 0 --force-all
python multi_gpu_launcher.py 0 --num-gpus 4 --start-delay 1 --vis-mesh

Stages for --force-stage: gen_mv_head_img, estimate_cam_using_vggt, animate_mv_img, track_mv_img, export_nerf_dataset, vis_mesh.
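To make the interaction between resume markers and the force flags concrete, here is a minimal sketch of how a run plan could be derived. It rests on several assumptions: the stage order is taken from the list above, forcing a stage reruns only that stage, and vis_mesh runs only when requested. The function stages_to_run and its arguments are hypothetical, and the real pipeline may behave differently, for example by also invalidating downstream stages.

```python
STAGES = [
    "gen_mv_head_img",
    "estimate_cam_using_vggt",
    "animate_mv_img",
    "track_mv_img",
    "export_nerf_dataset",
    "vis_mesh",
]


def stages_to_run(done, force_stage=None, force_all=False, vis_mesh=False):
    """List the stages a run would execute, in the order given above.

    `done` is the set of stage names that already have a resume marker.
    Hypothetical logic: not taken from run_headengine.py.
    """
    if force_stage is not None and force_stage not in STAGES:
        raise ValueError(f"unknown stage: {force_stage}")

    plan = []
    for stage in STAGES:
        # Assumption: the optional overlay stage runs only when requested.
        if stage == "vis_mesh" and not vis_mesh and force_stage != "vis_mesh":
            continue
        # Run a stage if it is forced or has no marker yet.
        if force_all or stage == force_stage or stage not in done:
            plan.append(stage)
    return plan
```

For example, with all five core stages marked done, `stages_to_run(done, force_stage="animate_mv_img")` yields only that stage in this sketch.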

Outputs

For ID 000000, a typical run writes:

data_pool/
|-- src_img/
|   `-- 000000/                 # SphereHead source views
|-- fast_view/
|   `-- 000000.jpg              # 3x3 preview grid
|-- export/
|   `-- 000000/
|       |-- vggt_cam.json       # VGGT camera parameters
|       |-- images/             # animated RGB frames
|       |-- fg_masks/           # foreground masks
|       |-- transforms.json     # NeRF-style metadata
|       |-- flame_param/        # tracked FLAME parameters
|       `-- vis_mesh/           # optional image + FLAME mesh overlays
|-- vhap_track/
|   `-- 000000/                 # VHAP optimization outputs
|-- done_marks/
|   `-- 000000/                 # per-stage resume markers
`-- log/                        # multi-GPU launcher logs
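A quick way to check whether an identity finished is to test for the paths in the tree above. The sketch below is a convenience script under assumptions, not part of the repository: the helper names are hypothetical, the default root mirrors PathConfig.data_dir, and the optional vis_mesh/ directory is left out on purpose.

```python
from pathlib import Path


def expected_outputs(identity: str, data_dir: str = "data_pool") -> dict[str, Path]:
    """Map a short label to each required output path for one identity."""
    root = Path(data_dir)
    export = root / "export" / identity
    return {
        "src_img": root / "src_img" / identity,
        "fast_view": root / "fast_view" / f"{identity}.jpg",
        "vggt_cam": export / "vggt_cam.json",
        "images": export / "images",
        "fg_masks": export / "fg_masks",
        "transforms": export / "transforms.json",
        "flame_param": export / "flame_param",
        "vhap_track": root / "vhap_track" / identity,
    }


def missing_outputs(identity: str, data_dir: str = "data_pool") -> list[str]:
    """Return the labels of expected outputs that do not exist on disk."""
    paths = expected_outputs(identity, data_dir)
    return [name for name, path in paths.items() if not path.exists()]
```

Calling `missing_outputs("000000")` returns an empty list when every required path is present.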

Animated images, masks, and mesh overlays use the same <frame>_<view> naming scheme, for example images/00012_03.png, fg_masks/00012_03.png, and vis_mesh/00012_03.png.
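The naming scheme can be built and parsed with a few lines of Python. This sketch assumes the zero-padding widths seen in the single example above (five digits for the frame, two for the view); the helper names are hypothetical.

```python
import re

# Matches names such as 00012_03.png: <frame>_<view>.png
_NAME = re.compile(r"^(\d+)_(\d+)\.png$")


def frame_view_name(frame: int, view: int) -> str:
    """Build a file name; padding widths are inferred from the example."""
    return f"{frame:05d}_{view:02d}.png"


def parse_frame_view(name: str) -> tuple[int, int]:
    """Split a file name into (frame, view) integers."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a <frame>_<view>.png name: {name}")
    return int(match.group(1)), int(match.group(2))
```

Because the three directories share file names, one parsed (frame, view) pair locates the matching image, mask, and overlay.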

Citation

If you find this project useful, please cite:

@misc{wu2026headengine,
    title     = {HeadEngine},
    author    = {Wu, Zijian},
    url       = {https://github.com/Zijian-Wu/HeadEngine},
    year      = {2026}
}
@inproceedings{wu2026uika,
    title     = {UIKA: Fast Universal Head Avatar from Pose-Free Images},
    author    = {Wu, Zijian and Zhou, Boyao and Hu, Liangxiao and Liu, Hongyu and Sun, Yuan and Wang, Xuan and Cao, Xun and Shen, Yujun and Zhu, Hao},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year      = {2026}
}

Acknowledgements

HeadEngine is part of UIKA and builds on VHAP, SphereHead, LivePortrait, and VGGT.

Please cite the upstream projects and follow their licenses.
