View YangS03's full-sized avatar


C++ and Python controller for ARX5 robot arm developed by @yihuai-gao (@real-stanford)

Python 138 29 Updated Mar 26, 2026

SpatialMP4 format cpp/python toolkit.

C++ 10 3 Updated Dec 25, 2025

An all-in-one VLA engineering platform for embodied AI — from data to real-robot deployment.

Python 236 19 Updated Apr 16, 2026

A ROS driver for Insta360 cameras, enabling real-time image capture, processing, and publishing in ROS environments.

Python 220 41 Updated Dec 25, 2025

PyTorch Lightning Optical Flow models, scripts, and pretrained weights.

Python 518 62 Updated Mar 31, 2026

[ICLR2026 - Oral] WAFT: Warping-Alone Field Transforms for Optical Flow

Python 202 20 Updated Mar 26, 2026

Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?

Python 518 44 Updated Apr 3, 2026

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 3,117 405 Updated Apr 16, 2026

One framework to evaluate any VLA model on any robot simulation benchmark.

Python 227 17 Updated Apr 15, 2026

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Python 1,335 184 Updated Apr 15, 2026

[ICLR 2026 Oral] Latent Particle World Models official repository

Jupyter Notebook 82 3 Updated Mar 19, 2026

[ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation

Python 121 6 Updated Feb 14, 2026

Causal video-action world model for generalist robot control

Python 1,006 76 Updated Apr 14, 2026

A Pragmatic VLA Foundation Model

Python 1,057 90 Updated Mar 12, 2026
Python 339 14 Updated Feb 10, 2026

VAE modified from Descript Audio Codec, which replaces the RVQ with VAE

Python 90 10 Updated Apr 2, 2024

DACVAE

Python 214 18 Updated Dec 22, 2025

REALM: A Real-to-Sim Validated Benchmark for Generalization in Robotic Manipulation

Python 49 2 Updated Apr 10, 2026

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,285 111 Updated Mar 2, 2025

Official inference repo for FLUX.1 models

Python 25,413 1,876 Updated Jul 31, 2025

An optimized PyTorch framework for behavior cloning with flow-related generative models.

Python 256 11 Updated Mar 26, 2026

Team Comet's 2025 BEHAVIOR Challenge Codebase

Python 241 20 Updated Jan 6, 2026

Distribution Matching Variational AutoEncoder (DMVAE)

Python 49 2 Updated Dec 9, 2025

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 366 11 Updated Oct 5, 2025

BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx

Python 1,421 184 Updated Apr 16, 2026

Multimodal Mixture-of-Experts VAE

Python 225 49 Updated Jul 6, 2023

Official Implementations for Paper - MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

Python 137 14 Updated Dec 3, 2025

[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"

Python 104 5 Updated Mar 13, 2026

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,684 252 Updated Jan 8, 2026

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 298 9 Updated Jan 29, 2026