reyoung

Yang Yu reyoung

I am the NLP/LLM infra lead for WeChat and was a core developer for Paddle. The WeChat LLM Infra Team is hiring! Please feel free to email me.

333 followers · 69 following

Tencent
Beijing

Achievements

x2 x3

Stars

aikitoria / nanotrace

Low overhead tracing library and trace visualizer for pipelined CUDA kernels

C 105 5 Updated Nov 12, 2025

apple / container

A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.

Swift 22,152 521 Updated Nov 14, 2025

reyoung / dockersvc

Rust 1 Updated Oct 11, 2025

linka-cloud / d2vm

Build Virtual Machine Image from Dockerfile or Docker image

Go 328 52 Updated Apr 29, 2025

gotoz / runq

run regular Docker images in KVM/Qemu

Go 838 48 Updated Apr 16, 2025

pgjones / hypercorn

Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.

Python 1,445 133 Updated Nov 8, 2025

KuangjuX / NVSHMEM-Tutorial

NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer

Cuda 142 11 Updated Sep 18, 2025

scylladb / seastar

High performance server-side application framework

C++ 8,965 1,649 Updated Nov 13, 2025

zhuzilin / flash-attention-with-sink

Python 39 1 Updated Aug 7, 2025

stepfun-ai / StepMesh

C++ 316 29 Updated Nov 13, 2025

tidwall / sjson

Set JSON values very quickly in Go

Go 2,656 176 Updated Nov 3, 2025

Dao-AILab / quack

A Quirky Assortment of CuTe Kernels

Python 653 61 Updated Oct 30, 2025

SandAI-org / MagiAttention

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 554 33 Updated Nov 14, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 2,471 252 Updated Nov 14, 2025

DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Python 12,036 1,989 Updated Oct 31, 2025

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 3,918 313 Updated Nov 14, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,930 286 Updated May 15, 2025

microsoft / Tutel

Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4

C 938 106 Updated Nov 10, 2025

microsoft / mscclpp

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 434 72 Updated Nov 14, 2025

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 913 44 Updated Oct 29, 2025

reugn / async

Synchronization and asynchronous computation package for Go

Go 279 15 Updated Jul 5, 2025

chalk-diagrams / chalk

A declarative drawing API in Python

Python 298 15 Updated Aug 28, 2024

SkyworkAI / skywork-o1-prm-inference

Python 65 7 Updated Nov 26, 2024

Haiyang-W / TokenFormer

[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 576 43 Updated Feb 11, 2025

borgo-lang / borgo

Borgo is a statically typed language that compiles to Go.

Rust 4,481 64 Updated Oct 27, 2024

firecrawl / firecrawl

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

TypeScript 67,708 5,259 Updated Nov 14, 2025

antgroup / glake

GLake: optimizing GPU memory management and IO transmission.

Python 489 44 Updated Mar 24, 2025

turboderp-org / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 4,363 324 Updated Aug 16, 2025

zhuzilin / ring-flash-attention

Ring attention implementation with flash attention

Python 909 88 Updated Sep 10, 2025

ejoy / ant

Ant game engine

Lua 3,918 405 Updated Mar 24, 2025
