Starred repositories
Causal video-action world model for generalist robot control
[ICLR 2026] Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling
AgentFlow: In-the-Flow Agentic System Optimization
Running VLA at 30Hz frame rate and 480Hz trajectory frequency
A general-purpose robotic agent framework based on LLMs. The LLM can independently reason, plan, and execute actions to operate diverse robot types across various scenarios to complete unpredictabl…
Official code release for DynoSAM: Dynamic Object Smoothing And Mapping. Accepted to Transactions on Robotics (Visual SLAM SI). A visual SLAM framework and pipeline for dynamic environments, estimatin…
1st place solution of 2025 BEHAVIOR Challenge
Official code for "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
This repository contains code for the paper "Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training" by T. Bonnaire, R. Urfin, G. Biroli and M. Mézard.
(ICRA 2025) Inverse Mixed Strategy Games with Generative Trajectory Models
A highly robust and accurate LiDAR-only and LiDAR-inertial odometry system
Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"
Visual Imitation Enables Contextual Humanoid Control. CoRL 2025, Best Student Paper Award.
Official Repo of "Disentangled Reinforcement Learning for Robust Visual Quality Assessment"
[NeurIPS'25 Spotlight] Official code repository for DANCE: Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition
[NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding
AI Trading OS: Multi-AI, multi-exchange trading infrastructure with Strategy Studio.
Official implementation of paper: Characterizing Dataset Bias via Disentangled Visual Concepts
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
[NeurIPS 2025] Flow x RL. "ReinFlow: Fine-tuning Flow Policy with Online Reinforcement Learning". Supports VLAs, e.g., pi0 and pi0.5. Fully open-sourced.
This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.
[ICCV 2025] SuperDec: 3D Scene Decomposition with Superquadric Primitives.
GPU-Powered Sequential Manipulation in Milliseconds
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment"
[ICCV 2025] RoboAnnotatorX: A Comprehensive and Universal Annotation Framework for Accurate Understanding of Long-horizon Robot Demonstration
Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)