YangS03

9 followers · 3 following

Stars

real-stanford / arx5-sdk

C++ and Python controller for ARX5 robot arm developed by @yihuai-gao (@real-stanford)

Python 138 29 Updated Mar 26, 2026

Pico-Developer / SpatialMP4

SpatialMP4 format cpp/python toolkit.

C++ 10 3 Updated Dec 25, 2025

FluxVLA / FluxVLA

An all-in-one VLA engineering platform for embodied AI — from data to real-robot deployment.

Python 236 19 Updated Apr 16, 2026

ai4ce / insta360_ros_driver

A ROS driver for Insta360 cameras, enabling real-time image capture, processing, and publishing in ROS environments.

Python 220 41 Updated Dec 25, 2025

hmorimitsu / ptlflow

PyTorch Lightning Optical Flow models, scripts, and pretrained weights.

Python 518 62 Updated Mar 31, 2026

princeton-vl / WAFT

[ICLR2026 - Oral] WAFT: Warping-Alone Field Transforms for Optical Flow

Python 202 20 Updated Mar 26, 2026

yuantianyuan01 / FastWAM

Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?

Python 518 44 Updated Apr 3, 2026

RLinf / RLinf

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 3,117 405 Updated Apr 16, 2026

allenai / vla-evaluation-harness

One framework to evaluate any VLA model on any robot simulation benchmark.

Python 227 17 Updated Apr 15, 2026

robocasa / robocasa

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Python 1,335 184 Updated Apr 15, 2026

taldatech / lpwm

[ICLR 2026 Oral] Latent Particle World Models official repository

Jupyter Notebook 82 3 Updated Mar 19, 2026

InternRobotics / RoboInter

[ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation

Python 121 6 Updated Feb 14, 2026

Robbyant / lingbot-va

Causal video-action world model for generalist robot control

Python 1,006 76 Updated Apr 14, 2026

Robbyant / lingbot-vla

A Pragmatic VLA Foundation Model

Python 1,057 90 Updated Mar 12, 2026

EGalahad / vla-scratch

Python 339 14 Updated Feb 10, 2026

innnky / descript-audio-vae

VAE modified from Descript Audio Codec, which replaces the RVQ with VAE

Python 90 10 Updated Apr 2, 2024

facebookresearch / dacvae

DACVAE

Python 214 18 Updated Dec 22, 2025

martin-sedlacek / REALM

REALM: A Real-to-Sim Validated Benchmark for Generalization in Robotic Manipulation

Python 49 2 Updated Apr 10, 2026

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,285 111 Updated Mar 2, 2025

black-forest-labs / flux

Official inference repo for FLUX.1 models

Python 25,413 1,876 Updated Jul 31, 2025

simchowitzlabpublic / much-ado-about-noising

An optimized PyTorch framework for behavior cloning with flow-related generative models.

Python 256 11 Updated Mar 26, 2026

mli0603 / openpi-comet

Team Comet's 2025 BEHAVIOR Challenge Codebase

Python 241 20 Updated Jan 6, 2026

sen-ye / dmvae

Distribution Matching Variational AutoEncoder (DMVAE)

Python 49 2 Updated Dec 9, 2025

dc-ai-projects / DC-Gen

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 366 11 Updated Oct 5, 2025

StanfordVL / BEHAVIOR-1K

BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx

Python 1,421 184 Updated Apr 16, 2026

iffsid / mmvae

Multimodal Mixture-of-Experts VAE

Python 225 49 Updated Jul 6, 2023

zliucz / MagicQuillV2

Official Implementations for Paper - MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

Python 137 14 Updated Dec 3, 2025

HHYHRHY / MM-ACT

[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"

Python 104 5 Updated Mar 13, 2026

QwenLM / Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,684 252 Updated Jan 8, 2026

tyfeld / MMaDA-Parallel

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 298 9 Updated Jan 29, 2026