CrowdGaussian tackles multi-person 3D reconstruction from a single image with a self-supervised adaptation pipeline and Self-Calibrated Learning, producing photorealistic, geometrically complete 3D Gaussian Splatting representations despite heavy occlusion and low image clarity.
CVPR 2026
We present a unified framework for reconstructing animatable 3D human avatars from a single portrait, introducing a Dual-UV representation to handle pose and framing sensitivity and a factorized synthetic data manifold to achieve state-of-the-art generalization across head, half-body, and full-body inputs.
CVPR 2026 Project Page Code
UIKA introduces a feed-forward, animatable Gaussian head model that achieves state-of-the-art reconstruction from arbitrary unposed inputs by utilizing a UV-guided remapping strategy and learnable UV tokens to aggregate view-independent features into canonical Gaussian attributes.
CVPR 2026 Project Page
DeX-Portrait introduces a diffusion-based portrait animation framework that achieves high-fidelity, disentangled control over head pose and facial expression through a dual-branch conditioning mechanism and a progressive hybrid classifier-free guidance strategy.
CVPR 2026 Project Page
Pressure2Motion establishes a new state of the art in privacy-preserving motion capture, introducing a hierarchical diffusion model that resolves the ambiguities of ground pressure data by integrating dual-level pressure features with high-level linguistic priors.
CVPR 2026
TEXTRIX introduces a native 3D attribute generation framework that bypasses the inconsistencies of multi-view fusion by utilizing a Diffusion Transformer on a latent 3D grid, enabling both high-fidelity, seamless texture synthesis and precise 3D part segmentation.
CVPR 2026 Project Page
SpatialVID provides a massive-scale dataset of over 21,000 hours of in-the-wild videos with dense 3D annotations—including camera poses, depth maps, and motion instructions—to overcome the data scarcity and scalability limitations currently hindering spatial intelligence and 3D vision research.
CVPR 2026 Project Page Code
We address the challenges of speed and generalization in sketch-based 3D pose estimation with a learn-from-synthesis strategy: trained on our synthetic sketch-pose dataset, Sketch2PoseNet efficiently and accurately predicts 3D human poses that generalize across diverse sketch styles.
SIGGRAPH Asia 2025 Project Page Code
TeRA introduces a highly efficient two-stage 3D human generative framework that outperforms SDS-based models by training a text-controlled latent diffusion model within a structured latent space, enabling fast, photorealistic avatar generation and text-based partial customization.
ICCV 2025 Project Page Code
We introduce HuGe100K, a large-scale HUman-centric GEnerated dataset. Leveraging its diversity in views, poses, and appearances, we propose a scalable feed-forward transformer that predicts a 3D human Gaussian representation in a uniform space from a single human image.
CVPR 2025 Project Page Code