Stars
Source code and trained models for DeFM (Depth Foundation Model)
Unofficial implementation of the Dreamer 4 world model in PyTorch.
An open source library designed to provide community examples of Joint Embedding Predictive Architectures (JEPAs). It contains code and examples for learning representations from images, video, and…
Code for "GVHMR: World-Grounded Human Motion Recovery via Gravity-View Coordinates", SIGGRAPH Asia 2024
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
An open source environment for digital agents.
Cambrian-S: Towards Spatial Supersensing in Video
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
Reliable, minimal and scalable library for pretraining foundation and world models
Hierarchical Reasoning Model Official Release
PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
A minimal, single-file implementation of the MeanFlow paper on 2D toy examples, with a side-by-side comparison to rectified flow.
Visual Imitation Enables Contextual Humanoid Control. CoRL 2025, Best Student Paper Award.
Nymeria: a massive collection of multimodal egocentric daily motion in the wild
A Modular Toolkit for Robot Kinematic Optimization
[RSS 2024] AdaptiGraph: Material-Adaptive Graph-Based Neural Dynamics for Robotic Manipulation
Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL).
A collection of papers on world models for autonomous driving (and robotics, etc.).
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Official code for the CVPR 2025 paper "Navigation World Models".
Official code repository for "Web Agents with World Models [ICLR 2025]".
Code for Scaling Language-Free Visual Representation Learning (WebSSL)
ICLR'25 Oral: Improving Probabilistic Diffusion Models With Optimal Covariance Matching
HaMeR: Reconstructing Hands in 3D with Transformers
[ICLR 2025] 6D Object Pose Tracking in Internet Videos for Robotic Manipulation
Muon is an optimizer for hidden layers in neural networks