- Beihang University
- Shenzhen, China
- UTC +08:00
- https://zhoues.github.io/
Stars
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
A curated list of awesome papers for reconstructing 4D spatial intelligence from video. (arXiv 2507.21045)
Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"
Any4D: Unified Feed-Forward Metric 4D Reconstruction
MM-ACT: Learn from Multimodal Parallel Generation to Act
[NeurIPS 25] TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
Training VLM agents with multi-turn reinforcement learning
Thinking in 360°: Humanoid Visual Search in the Wild
Code release for https://kovenyu.com/WonderWorld/
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
[NeurIPS 2025 Spotlight] Official implementation of the SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…
Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"
ViPE: Video Pose Engine for Geometric 3D Perception
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
Official codebase for "Any-point Trajectory Modeling for Policy Learning"
[ICCV 2025] SpatialTrackerV2: 3D Point Tracking Made Easy
[Actively Maintained🔥] A list of Embodied AI papers accepted by top conferences (ICLR, NeurIPS, ICML, RSS, CoRL, ICRA, IROS, CVPR, ICCV, ECCV).