NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.

Jupyter Notebook 10,494 702 Updated Jun 23, 2026

RoboTwin-Platform / RoboTwin

RoboTwin 2.0 Official Repo

Python 2,475 404 Updated May 23, 2026

dreamzero0 / dreamzero

Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals

Python 2,305 198 Updated Apr 19, 2026

datajuicer / data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 6,571 383 Updated Jun 23, 2026

multica-ai / andrej-karpathy-skills

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

180,643 18,483 Updated Apr 20, 2026

EvolvingLMMs-Lab / LLaVA-OneVision-2

Fully Open Framework for Democratized Multimodal Training

Python 1,098 75 Updated Jun 23, 2026

zli12321 / Vision-SR1

Reinforcement Learning of Vision Language Models with Self Visual Perception Reward

Python 174 17 Updated Mar 14, 2026

OpenMOSS / Awesome-WAM

A curated, continuously updated reading list, paper blogs, and resources for World Action Models (WAMs) in embodied AI.

HTML 914 23 Updated Jun 21, 2026

NTUMARS / Awesome-World-Model-for-Robotics-Policy

633 14 Updated May 16, 2026

QwenLM / Qwen-VLA

The official repository of Qwen-VLA

631 25 Updated May 29, 2026

neilsonnn / image-blaster

An image-to-world skillset for Claude.

TypeScript 4,623 465 Updated May 15, 2026

lucas-maes / le-wm

Official code base for LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels

Python 3,921 547 Updated May 26, 2026

tajwarfahim / maxrl

Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"

Python 189 29 Updated May 28, 2026

rabeya-akter / CountOCC

Implementation of the paper "Counting Through Occlusion: Framework for Open World Amodal Counting"

Python 1 Updated Nov 16, 2025

atinpothiraj / CAPTURe

CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting

Python 10 Updated Apr 23, 2025

yix8 / VisualPlanning

[ICLR 2026 Oral] Visual Planning: Let's Think Only with Images

Python 362 12 Updated Apr 24, 2026

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 19,389 2,481 Updated May 30, 2026

ByteDance-Seed / Depth-Anything-3

Depth Anything 3

Python 5,610 621 Updated Mar 21, 2026

DepthAnything / Depth-Anything-V2

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 8,319 862 Updated Mar 24, 2026

LiheYoung / Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 8,124 613 Updated Jul 17, 2024

junha1125 / PIVOT

Official implementation of "RL Makes MLLMs See Better Than SFT"

Python 7 Updated Apr 10, 2026

Qinym Wakals

Lists (9)

books

CG Phy

CV

ML

NLP

Papers

RL

AI in Practice

Courses

Stars