A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.

3,271 150 Updated Jun 17, 2026

thunlp / PromptPapers

Must-read papers on prompt-based tuning for pre-trained language models.

4,316 390 Updated Jul 17, 2023

WantongC / journal-adapt-writing-skill

Learn any journal's writing conventions from its published papers, then revise your manuscript to match — section by section.

641 40 Updated May 15, 2026

leofan90 / Awesome-World-Models

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…

Python 1,806 60 Updated Jun 18, 2026

inspatio / inspatio-world

Python 918 68 Updated Apr 13, 2026

OpenDriveLab / WorldEngine

WorldEngine: Towards the Era of Post-Training for Physical AI

Python 344 18 Updated Jun 17, 2026

whwangovo / pyre-code

A self-hosted ML coding practice platform. 68 problems from ReLU to flow matching — attention, training, RLHF, diffusion, and more. Instant feedback in the browser.

Python 1,143 107 Updated May 12, 2026

cvlab-uos / ViewSplat

Python 19 Updated Apr 7, 2026

cameronosmith / SIRE

Code for SIRE: SE(3) Intrinsic Rigidity Embeddings

Python 3 Updated Nov 26, 2025

cvg / YoNoSplat

[ICLR'26] YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting

Python 203 10 Updated Feb 25, 2026

jzr99 / Geo4D

[ICCV 2025 Highlight] Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction

Python 434 16 Updated Jun 6, 2025

thu-ml / Causal-Forcing

[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++

Python 790 46 Updated Jun 17, 2026

basilevh / gcd

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation

Python 288 10 Updated Nov 18, 2025

huggingface / smolagents

🤗 smolagents: a barebones library for agents that think in code.

Python 27,932 2,699 Updated Jun 16, 2026

vislearn / LessMore

Learning Less is More - 6D Camera Localization via 3D Surface Regression

C++ 264 49 Updated Sep 14, 2020

AgibotTech / genie_sim

Simulation Platform from AgiBot

Python 1,031 92 Updated May 8, 2026

peckjon / copilot-chat-to-markdown

Tools for converting Copilot chat conversations to markdown format

Python 110 18 Updated Sep 13, 2025

HaoranZhuExplorer / adljepa

[AAAI 2026] AD-L-JEPA: Self-Supervised Representation Learning with Joint Embedding Predictive Architecture for Automotive LiDAR Object Detection

Python 47 8 Updated Nov 18, 2025

liangxiansheng093 / BoRe-Depth

A novel lightweight monocular depth estimation method

Python 41 4 Updated Nov 17, 2025

limbopro / JichangTuijian

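Proxy service ("airport") recommendations that the 毒奶 blogger uses personally: plans from 100 GB for 15 CNY/month (up to 20% off), SS/v2Ray/Trojan protocol support, IEPL dedicated lines, stable low latency, and unlocking of ChatGPT, Netflix, and other streaming services.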

737 12 Updated Jun 13, 2026

limbopro / Paolujichang

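A collection of proxy services ("airports") for circumventing internet restrictions 🕸️ that shut down and ran off (2020-2026). Submissions welcome. Ad🔗🈲🙅❌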

6,052 91 Updated May 29, 2026

WangRongsheng / awesome-LLM-resources

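🧑‍🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).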

8,554 897 Updated Jun 16, 2026

afshinea / stanford-cme-295-transformers-large-language-models

VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models

4,500 645 Updated May 25, 2026

heshuting555 / Awesome-3DGS-Applications

[TPAMI 2026] A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation

369 17 Updated Jun 17, 2026

skindhu / Build-A-Large-Language-Model-CN

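Build a Large Language Model (From Scratch) is an e-book that explores the principles and implementation of large language models in depth, suited to learners who want a thorough understanding of the architecture, training process, and application development of large models such as GPT. To make this highly valuable textbook accessible to more Chinese readers, I decided to translate it into Chinese and share it as open source on GitHub.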

Green Mollylulu

Highlights

Lists (32)

3d

3d-reconstruction-gen

4D-spatial

agent-tools

autonomous-driving

basemodel

casuality

contrastive-learning

depth

detection

diffusion

generation

geometry

graph

human_pose

Know-distillation

knowledge

laneDetection

LLM-tools

LLMs/VLMs

multi-modal

online-course resource

open-vocabulary

prompt-related

RL

robotics

segmentation

self/semi-supervised learning

temporal

tools

tranditional_cv

world-model

Starred repositories

stereo-matching

video-instance-segmentation

object-detection

Natural language processing