Tony Yu Cao tsaoyu

🚀

Working

LLM, Reinforcement Learning, Robotics

97 followers · 39 following

https://www.tsaoyu.com

Achievements

Organizations

Lists (1)

Distributed-computing

The future of distributed computing

Starred repositories

google-deepmind / penzai

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,894 71 Updated Jun 22, 2025

obalcells / hallucination_probes

Real-Time Detection of Hallucinated Entities in Long-Form Generation

Python 289 30 Updated Nov 16, 2025

yaof20 / Flash-RL

Implementation of FP8/INT8 rollout for RL training without performance drop.

Python 304 23 Updated Nov 7, 2025

samsja / muon_fsdp_2

Muon fsdp 2

Python 62 7 Updated Aug 8, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 6,656 960 Updated Jun 21, 2026

PiotrNawrot / sparse-frontier

The evaluation framework for training-free sparse attention in LLMs

Python 123 12 Updated Jan 27, 2026

Multiverse4FM / Multiverse-Engine

Customized Inference Engine for Multiverse Models

Python 25 2 Updated Jun 27, 2025

HazyResearch / cartridges

Storing long contexts in tiny caches with self-study

Python 276 38 Updated Mar 23, 2026

ScalingIntelligence / tokasaurus

Python 478 38 Updated Nov 25, 2025

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 3,254 290 Updated Jun 22, 2026

NVIDIA / gdrcopy

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C 1,391 189 Updated Jun 15, 2026

transformerlab / transformerlab-app

The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.

Python 5,107 535 Updated Jun 20, 2026

microsoft / SeerAttention

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Python 204 20 Updated Jun 10, 2026

rdnfn / icai

Inverse Constitutional AI [ICLR 2025]: compressing pairwise preference data into a short constitution of principles.

Python 41 7 Updated May 6, 2026

simular-ai / Agent-S

Agent S: an open agentic framework that uses computers like a human

Python 11,902 1,402 Updated May 13, 2026

MathFoundationRL / Book-Mathematical-Foundation-of-Reinforcement-Learning

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 16,533 1,565 Updated May 26, 2026

SwanHubX / SwanLab

⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / verl / LLaMA Factory / ms-swift / U…

Python 4,010 210 Updated Jun 22, 2026

steel-dev / steel-browser

🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser sandbox that lets you automate the web without worrying about infrastructure.

TypeScript 7,203 938 Updated Jun 9, 2026

fla-org / native-sparse-attention

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 1,006 53 Updated Feb 5, 2026

dhealy05 / frames_of_mind

Animating R1's thoughts.

Python 380 11 Updated Feb 17, 2025

facebookresearch / LeanUniverse

LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management

Python 77 5 Updated Jan 15, 2025

aliyun / SimAI

Python 1,014 176 Updated Apr 24, 2026

ServiceNow / BrowserGym

🌎💪 BrowserGym, a Gym environment for web task automation

Python 1,256 177 Updated Mar 17, 2026

zai-org / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 3,194 281 Updated Dec 5, 2024

gpt-omni / mini-omni

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,562 310 Updated Nov 5, 2024

facefusion / facefusion

Industry leading face manipulation platform

Python 29,021 4,719 Updated Jun 22, 2026

alexrame / rewardedsoups

Rewarded soups official implementation

HTML 64 9 Updated Sep 27, 2023

openinterpreter / openinterpreter

A lightweight coding agent for open models like Deepseek, Kimi, and Qwen

Rust 64,085 5,556 Updated Jun 20, 2026

hunterirving / macproxy_plus

browse the modern web on vintage computers

Python 220 19 Updated Mar 18, 2026

openai / swarm

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 21,657 2,311 Updated Apr 15, 2026

Starred topics

gazebo

Robotics

sailing