TwinPose: Person-Specific Subspaces for Multi-View 3D Pose Estimation

News

2026-06-08: 🔗 Code is now available at https://github.com/HYPER-THEORY/TwinPose.
2026-05-06: 🎉 TwinPose has been accepted to SIGGRAPH 2026 Journal Track (ACM Transactions on Graphics)!
2024-11-01: 🚀 TwinPose was developed and integrated into our in-house real-time multi-view motion capture system.

Introduction

Following the success of deep neural networks in 2D pose estimation, reconstruction-based approaches have significantly advanced multi-person 3D pose estimation from sparse multi-view images. These methods typically detect 2D poses independently in each view and then associate them across views for 3D reconstruction. Despite strong progress, recent state-of-the-art methods still face two critical limitations: 1) They often depend on global optimization over a large set of multi-view 2D joints to jointly infer the 3D poses of all individuals, which makes inference complex and prone to suboptimal solutions; 2) Their tight coupling with the bottom-up detector OpenPose hinders the use of more advanced top-down or single-stage 2D pose estimators and restricts the integration of the richer instance-level cues learned by these models.

To address these limitations, we propose TwinPose, a novel framework that alleviates the complexity of global pose inference by optimizing within person-specific 3D pose subspaces, while fully supporting diverse 2D pose detectors and effectively leveraging pose-instance cues. The key idea is to introduce a twin pose — a 3D counterpart of each 2D pose — that inherits its instance representation and aggregates geometrically consistent 2D joints from other views. All twin poses are unified in a common 3D space, where those belonging to the same individual naturally share a number of bones. This structural property enables association by counting shared bones, forming person-specific subspaces from which each individual’s 3D pose can be inferred independently in an efficient and robust manner.
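The association step described above can be illustrated with a minimal sketch. It assumes each twin pose has already been reduced to a set of hashable bone keys, so that two twin poses share a bone exactly when they hold the same key. The function name `group_twin_poses`, the key format, and the `min_shared_bones` threshold are hypothetical choices for illustration and are not taken from the TwinPose code; the sketch only shows how counting shared bones can partition twin poses into per-person groups.

```python
from collections import defaultdict
from itertools import combinations


def group_twin_poses(twin_poses, min_shared_bones=2):
    """Group twin poses that share at least `min_shared_bones` bones.

    twin_poses: dict mapping a twin-pose id to a set of hashable bone keys
    (hypothetical representation, assumed for this sketch).
    Returns a sorted list of sorted id lists, one list per group.
    """
    parent = {pid: pid for pid in twin_poses}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair of twin poses that shares enough bones.
    for a, b in combinations(twin_poses, 2):
        shared = len(twin_poses[a] & twin_poses[b])
        if shared >= min_shared_bones:
            parent[find(a)] = find(b)

    # Collect the connected components as person-specific groups.
    groups = defaultdict(list)
    for pid in twin_poses:
        groups[find(pid)].append(pid)
    return sorted(sorted(g) for g in groups.values())


# Toy input: five twin poses from three cameras, bone keys are placeholders.
twin_poses = {
    "cam0_det0": {"b1", "b2", "b3"},
    "cam1_det1": {"b2", "b3", "b4"},
    "cam2_det0": {"b3", "b4", "b9"},
    "cam0_det1": {"b5", "b6", "b7"},
    "cam1_det0": {"b5", "b6", "b8"},
}

print(group_twin_poses(twin_poses))
# → [['cam0_det0', 'cam1_det1', 'cam2_det0'], ['cam0_det1', 'cam1_det0']]
```

With the default threshold, the five toy twin poses fall into two groups, one per person. Note that `cam0_det0` and `cam2_det0` share only one bone directly but are still grouped through `cam1_det1`; whether such transitive linking is desirable is a design choice of this sketch. 3D pose inference could then run on each group independently.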

Extensive experiments demonstrate that TwinPose achieves state-of-the-art performance in both accuracy and efficiency across multiple public and proprietary datasets. Importantly, it is fully detector-agnostic, allowing seamless integration with current and future advances in 2D pose estimation while remaining highly robust to noisy or imperfect 2D predictions.

Perspective and Broader Impact

TwinPose reflects our observation-first view of multi-view 3D motion capture: the quality of 2D observations from each camera determines the upper bound of 3D pose estimation. The goal of TwinPose is to make this upper bound easier to approach in practice. By building person-specific 3D pose subspaces, TwinPose avoids heavy global optimization, supports arbitrary 2D human pose detectors, and provides a scalable framework for future improvements driven by stronger 2D pose estimation models.

This framework also connects naturally with our broader research on video-based 2D pose estimation, including DSTA, PAVE-Net, and TAR-ViTPose. These works systematically explore how temporal information can be used to improve 2D pose estimation, with the hope of moving beyond the dominant single-frame paradigm toward a more robust video-based paradigm.

Quantitative Performance

TwinPose achieves state-of-the-art accuracy with the fastest per-frame time (e.g., 0.92 ms on Shelf) and full flexibility to work with any 2D pose detector (e.g., HRNet, RTMO, and OpenPose).

Quantitative comparison on the Shelf dataset.

Method	A1	A2	A3	Avg	Time (ms)
Tanke and Gall [2019]	99.8	90.0	98.0	96.0	N/A
Bridgeman et al. [2019]	99.3	91.6	97.6	96.2	9.1
Dong et al. [2019]	98.8	94.1	97.8	96.9	90
Chen et al. [2020a]	99.6	93.2	97.5	96.8	3.08
Tu et al. [2020]	99.3	94.1	97.6	97.0	333
Huang et al. [2020]	98.8	96.2	97.2	97.4	640
Zhang et al. [2020]	99.0	96.2	97.6	97.6	31.9
Dong et al. [2021]	99.1	93.5	98.1	96.9	N/A
Wang et al. [2021]	99.3	95.1	97.8	97.4	~170
Wu et al. [2021]	99.3	96.5	97.3	97.7	~48.8
Reddy et al. [2021]	99.1	96.3	98.3	97.9	>333
Lin and Lee [2021]	99.3	96.5	98.0	97.9	23.4
Zhang et al. [2021]	99.5	97.0	97.8	98.1	>31.9
Zhou et al. [2022]	99.5	96.7	98.2	98.1	2.94
Choudhury et al. [2023]	99.0	96.3	98.2	97.8	N/A
Liao et al. [2024]	99.5	96.8	97.8	98.0	210
TwinPose (Ours)	99.8	96.2	98.5	98.2	0.92

Quantitative comparison on the 4DA dataset.

Method	2D Detector	Precision (%)	Recall (%)
Methods tightly coupled to the bottom‑up detector OpenPose
Zhang et al. [2020]	OpenPose	88.5	90.2
Dong et al. [2021]	OpenPose	90.1	89.0
Zhou et al. [2022]	OpenPose	92.0	91.2
Detector‑agnostic methods (any 2D pose detector)
Dong et al. [2019]	OpenPose	78.5	77.1
Dong et al. [2019]	HRNet	84.9	84.9
Dong et al. [2019]	RTMO	85.4	85.5
TwinPose (Ours)	OpenPose	91.4	90.4
TwinPose (Ours)	HRNet	94.3	93.2
TwinPose (Ours)	RTMO	94.8	95.0

Quantitative comparison on the Hi4D dataset.

Method	MPJPE ↓	PCP ↑	AP₅₀ ↑	AP₁₀₀ ↑	Recall ↑
Dong et al. [2019]	53.05	87.57	67.97	80.28	93.80
Zhang et al. [2020]	41.29	88.62	80.87	97.27	98.78
Lu et al. [2024a]	32.10	96.90	91.48	97.33	98.78
TwinPose (Ours)	22.00	99.71	90.80	99.34	99.86
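For readers unfamiliar with the MPJPE and PCP columns, the following is a minimal sketch of how these two metrics are commonly computed for a single matched person. The README does not state the exact evaluation protocol, so the units, the limb list, and the `alpha=0.5` threshold (the usual PCP convention) are assumptions, and the function names are illustrative.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def pcp(pred, gt, limbs, alpha=0.5):
    """Percentage of correct parts.

    A limb (i, j) counts as correct when the mean error of its two endpoints
    is at most `alpha` times the ground-truth limb length (assumed convention).
    """
    correct = 0
    for i, j in limbs:
        limb_length = math.dist(gt[i], gt[j])
        endpoint_error = (math.dist(pred[i], gt[i]) + math.dist(pred[j], gt[j])) / 2
        if endpoint_error <= alpha * limb_length:
            correct += 1
    return 100.0 * correct / len(limbs)


# Toy example: three joints on a line, two limbs of length 100.
gt = [(0, 0, 0), (0, 0, 100), (0, 0, 200)]
pred = [(0, 0, 0), (30, 0, 100), (0, 80, 200)]
limbs = [(0, 1), (1, 2)]

print(round(mpjpe(pred, gt), 2))  # → 36.67
print(pcp(pred, gt, limbs))       # → 50.0
```

In the toy example the first limb has a mean endpoint error of 15, below the threshold of 50, while the second has 55 and is rejected, so half of the limbs count as correct.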

Qualitative Results

Comparison with a skeleton-level association method [Dong et al. 2019]. Traditional skeleton-level association approaches indiscriminately use all joints and bones, leading to incorrect associations (red boxes). TwinPose preserves only joints that are geometrically consistent across views, substantially improving robustness.

Comparison with the state-of-the-art 4DA method [Zhang et al. 2020]. Global optimization in 4DA causes incorrect cross-person associations (red boxes). TwinPose performs person-specific inference in pose subspaces, enhancing both robustness and efficiency.

Whole-body 3D pose estimation results of our method on the Panoptic dataset. Results from eight camera views demonstrate consistent multi-view reconstruction of body, hands, feet, and facial keypoints.

Video Demo

For a complete video demonstration of our method, please see this YouTube video.


Citations

If you find our paper useful in your research, please consider citing:

@article{yang2026twinpose,
  title         = {TwinPose: Person-Specific Subspaces for Multi-View 3D Pose Estimation},
  author        = {Yang, Wenwu and He, Tianyi and Ding, Jiwei and Wang, Xun and Zhang, Rong and Zhou, Kun},
  journal       = {ACM Transactions on Graphics},
  volume        = {45},
  number        = {4},
  articleno     = {61},
  year          = {2026},
  note          = {SIGGRAPH 2026 Journal Track}
}

@article{yang2023lightweight,
  title         = {Lightweight Multi-Person Motion Capture System in the Wild},
  author        = {Yang, Wenwu and Li, Yue and Xing, Shuai and Cai, Jiahang and Wang, Xun},
  journal       = {SCIENTIA SINICA Informationis},
  volume        = {53},
  number        = {11},
  pages         = {2230--2249},
  year          = {2023},
  note          = {In Chinese}
}

Acknowledgement

We thank Tianyi He for implementing the TwinPose algorithm; Jiwei Ding for his assistance with the quantitative and qualitative experiments; Yihui Sun and Bin Zhou for their assistance with the experiments on whole-body 3D pose estimation and learning-based methods; Siying Chen for video editing and homepage development; Xiongbin Lin for video editing; and all participants who contributed to the motion capture data collection.
