- UMass Amherst
- Beijing, China
- haoyuzhen.com
Stars
[CVPR 2026] UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos
Reimplementation of LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory
WorldArena: A Unified Benchmark for Evaluating Perception and Functional Utility of Embodied World Models
An all-in-one humanoid research platform on top of Genesis.
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
A Fair and Scalable Time Series Forecasting Benchmark and Toolkit.
[ICLR 2026] Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
[ICLR 2026] Trace Anything: Representing Any Video in 4D via Trajectory Fields
Awesome paper list and repos of the paper "A comprehensive survey of embodied world models".
[NeurIPS DB 2025] IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
4DNeX: Feed-Forward 4D Generative Modeling Made Easy
Reference PyTorch implementation and models for DINOv3
Official repo for the [CVPR 2025] paper "Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation". https://jiaming-zhou.github.io/projects/HumanRobotAlign/
ViPE: Video Pose Engine for Geometric 3D Perception
Official code for EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
[ICCV 2025] 💐 Dense Policy: Bidirectional Autoregressive Learning of Actions 🚀DSP
Wan: Open and Advanced Large-Scale Video Generative Models
Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching
A collection of papers on world models for autonomous driving (and robotics, etc.).
(CVPR 2025 Highlight) The Scene Language: Representing Scenes with Programs, Words, and Embeddings
[ICLR 2026] π^3: Permutation-Equivariant Visual Geometry Learning
[NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"