UESTC, Nio Inc., Alibaba Group
https://xhghhh.github.io/
Stars
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
Sharp Monocular View Synthesis in Less Than a Second
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
[NeurIPS 2025] Official implementation for "Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling"
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
A Foundation Model for Generalist Gaming Agents
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
Official implementation of "From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction"
[ECCV 2024] Official implementation of the paper "DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model".
[WACV 2024 Survey Paper] Multimodal Large Language Models for Autonomous Driving
🚀🚀 [Large Models] Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏
Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Bird's-eye view for CARLA simulator
All-in-one Chrome extension for web-page note annotation, real-time collaboration, and reading focus tools.
PyTorch code for the paper "Model-Based Imitation Learning for Urban Driving".