Stars
Data processing for and with foundation models!
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
SAPIEN Manipulation Skill Framework, an open source GPU parallelized robotics simulator and benchmark, led by Hillbot, Inc.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
BoxMOT: Pluggable SOTA multi-object tracking modules for segmentation, object detection, and pose estimation models
A curated list of research on embodied AI and robots with Large Language Models. Watch this repository for the latest updates!
[TMLR 2024] repository for VLN with foundation models
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
SEED-Voken: A Series of Powerful Visual Tokenizers
PyTorch implementation of paper "ARTrack" and "ARTrackV2"
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
This repository provides a valuable reference for researchers in multimodality; start your exploration of RL-based reasoning MLLMs here!
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, code, and related websites
[CoRL 2025] Repository relating to "TrackVLA: Embodied Visual Tracking in the Wild"
[RSS 2024 & RSS 2025] VLN-CE evaluation code of NaVid and Uni-NaVid
RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter.
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
State-of-the-art 2D and 3D Face Analysis Project
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
RoboOS: A Universal Embodied Operating System for Cross-Embodied and Multi-Robot Collaboration
Vision-Language Navigation Benchmark in Isaac Lab
RetinaFace: Deep Face Detection Library for Python
Official repo and evaluation implementation of VSI-Bench
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics
Open-Sora: Democratizing Efficient Video Production for All
MichalZawalski / embodied-CoT
Forked from openvla/openvla. Embodied Chain of Thought: a robotic policy that reasons to solve the task.