Stars
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
ACMMM2024: Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement
CVPR 2025 | Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond
Repository for "Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines"
CVPR 2025 - V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion
AAAI2025 Oral - L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection
ChatDev 2.0: Dev All through LLM-powered Multi-Agent Collaboration
Real-Time VLAs via Future-state-aware Asynchronous Inference.
siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems
[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
[Paper][EMNLP 2025] RTQA: Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models
MrlX: A Multi-Agent Reinforcement Learning Framework
[ICLR'26] MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
Official implementation of "Data Scaling Laws in Imitation Learning for Robotic Manipulation"
Official implementation of "OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning"
[ICCV 2025] VLDrive: Vision-Augmented Lightweight MLLMs for Efficient Language-grounded Autonomous Driving
Joycon-Robotics: Low-Cost, Convenient Teleoperation for One- and Two-Arm Robots
U-Arm: Lerobot-Everything-Cross-Embodiment-Teleoperation
Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving (ICLR 2026)
[ICLR 2026] Unified Vision-Language-Action Model
🔥 The first open-sourced diffusion vision-language-action model. [ICLR 2026]
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Scalable RL solution for advanced reasoning of language models
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning