
Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera

Zhengdi Yu · Stefanos Zafeiriou · Tolga Birdal

Imperial College London

CVPR 2025 (Highlight)

[Demo GIFs: in-the-wild sequence and global motion reconstruction]

Introduction

We propose Dyn-HaMR to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild, as a remedy for the entanglement of hand and camera motion in such footage.

Table of Contents
  1. Installation
  2. Get Started
  3. Citation

News 🚩

  • [2025/11/20] 🚀 Major Update:

    • Integrated VIPE for camera estimation, significantly improving reconstruction quality over DROID-SLAM.
    • Enhanced hand tracker with robust hallucination prevention and handedness correction, giving better hand tracking and significantly improved temporal consistency. Please run pip install ultralytics==8.1.34, since YOLO is used in this version (thanks to WiLoR). Please download the checkpoint from here and put it under third-party/hamer/pretrained_models.

    See comparison below:

    Before: Jitter from DROID-SLAM → New: VIPE + enhanced HaMeR (Recommended)
    Before: Handedness shifting with original HaMeR → New: Enhanced hand tracker (Recommended)
  • [2025/06/04] Code released.

  • [2024/12/18] Paper is now available on arXiv. ⭐

Installation

Environment setup

  1. Clone the repository and its submodules with the following command:

     git clone --recursive https://github.com/ZhengdiYu/Dyn-HaMR.git
     cd Dyn-HaMR

    You can also run the following command to fetch the submodules:

    git submodule update --init --recursive .
  2. To set up the virtual environment for Dyn-HaMR, we provide integrated setup scripts in scripts/. You can create the environment with:

    source install_pip.sh

    Or, alternatively, create the environment with conda:

    source install_conda.sh
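
After installation, you can optionally sanity-check that the environment sees PyTorch and your GPU before running the pipeline. This is only a minimal sketch, not part of the provided setup scripts:

import torch  # Dyn-HaMR's optimization stack is PyTorch-based

print("PyTorch version:", torch.__version__)
print("CUDA available :", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))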

Model checkpoints download

Please run the following command to fetch the data dependencies. This will create a _DATA folder:

source prepare.sh

After processing, the folder layout should be:

|-- _DATA
|   |-- data/  
|   |   |-- mano/
|   |   |   |-- MANO_RIGHT.pkl
|   |   |-- mano_mean_params.npz
|   |-- BMC/
|   |-- hamer_ckpts/
|   |-- vitpose_ckpts/
|   |-- <SLAM model .pkl>

Prerequisites

We use the MANO model for hand mesh representation. Please visit the MANO website to register and download the model. Please download MANO_RIGHT.pkl and put it under the _DATA/data/mano folder.
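
To confirm that prepare.sh and the manual MANO download produced the layout shown above, a quick check such as the sketch below can help. It only mirrors the documented paths; the SLAM checkpoint name is left out since it varies:

from pathlib import Path

# Illustrative check of the documented _DATA layout; not part of the repository.
ROOT = Path("_DATA")
expected = [
    ROOT / "data" / "mano" / "MANO_RIGHT.pkl",   # downloaded manually from the MANO website
    ROOT / "data" / "mano_mean_params.npz",
    ROOT / "BMC",
    ROOT / "hamer_ckpts",
    ROOT / "vitpose_ckpts",
]
for path in expected:
    print(f"[{'OK' if path.exists() else 'MISSING'}] {path}")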

Get Started 🚀

Preparation

Please follow the instructions here to compute the .npy files below and place them under dyn-hamr/optim/BMC/ (a quick loading check is sketched after the list):

|-- BMC
|   |-- bone_len_max.npy
|   |-- bone_len_min.npy
|   |-- CONVEX_HULLS.npy
|   |-- curvatures_max.npy
|   |-- curvatures_min.npy
|   |-- joint_angles.npy
|   |-- PHI_max.npy
|   |-- PHI_min.npy
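
Once the files are in place, a quick way to verify them is to load them back with NumPy. This is only an illustrative check; the expected shapes depend on the BMC implementation and are not asserted here:

from pathlib import Path
import numpy as np

# Illustrative sanity check for the BMC constraint files listed above.
BMC_DIR = Path("dyn-hamr/optim/BMC")
FILES = [
    "bone_len_max.npy", "bone_len_min.npy", "CONVEX_HULLS.npy",
    "curvatures_max.npy", "curvatures_min.npy", "joint_angles.npy",
    "PHI_max.npy", "PHI_min.npy",
]
for name in FILES:
    arr = np.load(BMC_DIR / name, allow_pickle=True)  # CONVEX_HULLS may be an object array
    print(f"{name}: shape={getattr(arr, 'shape', None)}, dtype={getattr(arr, 'dtype', None)}")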

Note

If accurate camera parameters are available, please follow the format of Dyn-HaMR/test/dynhamr/cameras/demo/shot-0/cameras.npz to prepare the camera parameters for loading. Similarly, you can use Dyn-HaMR to refine and recover the hand mesh in the world coordinate system, initialized from your own 2D & 3D motion data.
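
The key names inside cameras.npz are not spelled out here, so the simplest way to match the expected format is to inspect the demo file shipped with the repository and mirror its arrays. A minimal inspection sketch:

import numpy as np

# Print the keys, shapes and dtypes of the demo camera file so you can mirror them.
demo = np.load("test/dynhamr/cameras/demo/shot-0/cameras.npz")
for key in demo.files:
    print(f"{key}: shape={demo[key].shape}, dtype={demo[key].dtype}")

# To supply your own cameras, save arrays under the same keys, e.g.
# np.savez("cameras.npz", **{key: my_arrays[key] for key in demo.files})
# where my_arrays is a hypothetical dict holding your per-frame camera data.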

Customize configurations

Config                        Operation
GPU                           Edit in <CONFIG_GPU>
Video info                    Edit in <VIDEO_SEQ>
Interval                      Edit in <VIDEO_START_END>
Optimization configurations   Edit in <OPT_WEIGHTS>
General configurations        Edit in <GENERAL_CONFIG>

Fitting on RGB-(D) videos 🎮

To run the optimization pipeline for fitting on arbitrary RGB-(D) videos, please first edit the path information in dyn-hamr/confs/data/video.yaml, where root is the root folder of all of your datasets and video_dir is the folder that contains the videos. The key seq is the name of the video you want to process. For example, you can run the following command to recover the global motion for test/videos/demo1.mp4:

🌟 Using VIPE for Camera Estimation (Recommended)

For significantly better camera estimation quality, use VIPE instead of DROID-SLAM:

python run_opt.py data=video_vipe run_opt=True data.seq=demo1 is_static=False

🌟 Using the original DROID-SLAM for Camera Estimation

python run_opt.py data=video run_opt=True data.seq=demo1 is_static=<True or False>

VIPE will automatically run if results are not found. Make sure you have:

  1. Installed VIPE in third-party/vipe/ with a conda environment named vipe
  2. Set src_path in dyn-hamr/confs/data/video_vipe.yaml to your video file
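
If you prefer to set src_path programmatically instead of editing the YAML by hand, a sketch along these lines should work. Only the src_path key is taken from the instructions above; PyYAML availability and the key being top-level are assumptions:

import yaml  # assumes PyYAML is installed; otherwise just edit the file manually

CONF = "dyn-hamr/confs/data/video_vipe.yaml"

with open(CONF) as f:
    cfg = yaml.safe_load(f)

cfg["src_path"] = "/path/to/your/video.mp4"  # hypothetical path to your input video

with open(CONF, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)

Note that re-serializing the file drops any comments, so hand-editing remains the safer option.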

By default, the camera parameters are predicted during the process and a moving camera is assumed (is_static=False). If your video was recorded with a static camera, you can add is_static=True for more stable optimization. The result will be saved to outputs/logs/video-custom/<DATE>/<VIDEO_NAME>-<tracklet>-shot-<shot_id>-<start_frame_id>-<end_frame_id>. After optimization, you can specify the output log directory and visualize the results by running the following command:

python run_vis.py --log_root <LOG_ROOT>

This will visualize all log subdirectories and save the rendered videos and images, as well as the 3D meshes in world space, under <LOG_ROOT>. Please see run_vis.py for further details. Alternatively, you can use the following command to run and visualize the results in one stage:

python -u run_opt.py data=video_vipe run_opt=True run_vis=True is_static=<True or False>

As a multi-stage pipeline, the optimization process can be customized. Add is_static=True for videos recorded with a static camera. Adding run_prior=True activates the motion prior in stage III. Please note that in the current version, when the prior module is activated, the motion chunk size needs to be set to 128 to stay compatible with the original setting of HMP.
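
Because each run is written to a dated subdirectory of outputs/logs/video-custom, a small helper can locate the most recent run and hand it to run_vis.py. The sketch below only relies on the directory layout described above:

import subprocess
from pathlib import Path

# Find the newest <DATE>/<VIDEO_NAME>-... run directory and visualize it.
LOG_BASE = Path("outputs/logs/video-custom")
runs = sorted(
    (p for p in LOG_BASE.glob("*/*") if p.is_dir()),
    key=lambda p: p.stat().st_mtime,
)
if not runs:
    raise SystemExit(f"No runs found under {LOG_BASE}")

latest = runs[-1]
print(f"Visualizing {latest}")
subprocess.run(["python", "run_vis.py", "--log_root", str(latest)], check=True)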

Blender Addon

Coming soon.

Acknowledgements

The PyTorch implementation of MANO is based on manopth. Part of the fitting and optimization code of this repository is borrowed from SLAHMR. For data preprocessing and observation, ViTPose and HaMeR are used for 2D keypoint detection and MANO parameter initialization. For camera motion estimation, we support VIPE (recommended), DPVO, and DROID-SLAM. For biomechanical constraints and the motion prior, we use the code from here and HMP. We thank all the authors for their impressive work!

License

Please see License for details of Dyn-HaMR. This code and model are available only for non-commercial research purposes as defined in the LICENSE (i.e., MIT LICENSE). Note that for MANO you must agree to its own license. You can check the license of MANO at https://mano.is.tue.mpg.de/license.html.

Citation

@inproceedings{yu2025dynhamr,
  title     = {Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera},
  author    = {Yu, Zhengdi and Zafeiriou, Stefanos and Birdal, Tolga},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2025},
}

Contact

For any technical questions, please contact z.yu23@imperial.ac.uk or ZhengdiYu@hotmail.com.
