PyTorch Lightning Optical Flow models, scripts, and pretrained weights.

Python 515 61 Updated Mar 31, 2026

[ICLR 2026 Oral] WAFT: Warping-Alone Field Transforms for Optical Flow

Python 198 19 Updated Mar 26, 2026

Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?

Python 366 21 Updated Apr 3, 2026

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 2,988 380 Updated Apr 4, 2026

One framework to evaluate any VLA model on any robot simulation benchmark.

Python 191 13 Updated Apr 4, 2026

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Python 1,312 180 Updated Mar 18, 2026

[ICLR 2026 Oral] Latent Particle World Models official repository

Jupyter Notebook 75 3 Updated Mar 19, 2026

[ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation

Python 108 4 Updated Feb 14, 2026

Causal video-action world model for generalist robot control

Python 952 66 Updated Feb 27, 2026

A Pragmatic VLA Foundation Model

Python 1,019 88 Updated Mar 12, 2026
Python 338 14 Updated Feb 10, 2026

VAE modified from Descript Audio Codec, which replaces the RVQ with VAE

Python 90 8 Updated Apr 2, 2024

DACVAE

Python 209 17 Updated Dec 22, 2025

REALM: A Real-to-Sim Validated Benchmark for Generalization in Robotic Manipulation

Python 49 2 Updated Apr 1, 2026

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,282 110 Updated Mar 2, 2025

Official inference repo for FLUX.1 models

Python 25,373 1,872 Updated Jul 31, 2025

An optimized PyTorch framework for behavior cloning with flow-related generative models.

Python 253 10 Updated Mar 26, 2026

Team Comet's 2025 BEHAVIOR Challenge Codebase

Python 235 19 Updated Jan 6, 2026

Distribution Matching Variational AutoEncoder (DMVAE)

Python 49 2 Updated Dec 9, 2025

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 364 11 Updated Oct 5, 2025

BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx

Python 1,397 179 Updated Apr 5, 2026

Multimodal Mixture-of-Experts VAE

Python 223 50 Updated Jul 6, 2023

Official Implementations for Paper - MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

Python 133 12 Updated Dec 3, 2025

[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"

Python 101 5 Updated Mar 13, 2026

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,640 247 Updated Jan 8, 2026

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 295 8 Updated Jan 29, 2026

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Python 60 4 Updated Dec 3, 2025

EO: Open-source Unified Embodied Foundation Model Series

Jupyter Notebook 52 3 Updated Jan 15, 2026

SAM 3D Objects

Python 6,366 745 Updated Mar 12, 2026
Python 1,001 85 Updated Jan 25, 2026