Yong Wu yongwww

🐢

working

MLSys Engineer @ Nvidia | FlashInfer and Machine Learning Compiler LLM co-design

103 followers · 86 following

@NVIDIA
Redmond, WA
13:32 (UTC -07:00)

Achievements

x3 x3

flashinfer Public
Forked from flashinfer-ai/flashinfer

FlashInfer: Kernel Library for LLM Serving

Python · Apache License 2.0 · Updated Mar 31, 2026
tvm Public
Forked from apache/tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Python · Apache License 2.0 · Updated Mar 25, 2026
openclaw Public
Forked from openclaw/openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript · MIT License · Updated Mar 20, 2026
tvm-ffi Public
Forked from apache/tvm-ffi

TVM FFI

C++ · Apache License 2.0 · Updated Feb 24, 2026
ci Public
Forked from tlc-pack/ci

Repository which handles configuration of TVM CI infrastructure.

Python · Apache License 2.0 · Updated Feb 10, 2026
ci-infra Public
Forked from flashinfer-ai/ci-infra

Shell · Apache License 2.0 · Updated Feb 7, 2026
terraform-aws-github-runner Public
Forked from github-aws-runners/terraform-aws-github-runner

Terraform module for scalable GitHub action runners on AWS

TypeScript · MIT License · Updated Jan 24, 2026
vibetensor Public
Forked from NVlabs/vibetensor

Our first fully AI generated deep learning system

Python · Apache License 2.0 · Updated Jan 22, 2026
flashinfer-bench Public
Forked from flashinfer-ai/flashinfer-bench

Building the Virtuous Cycle for AI-driven LLM Systems

Python · Apache License 2.0 · Updated Dec 17, 2025
dynamo Public
Forked from ai-dynamo/dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust · Apache License 2.0 · Updated Dec 10, 2025
NeMo Public
Forked from NVIDIA-NeMo/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python · Apache License 2.0 · Updated Dec 10, 2025
sglang Public
Forked from sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.

Python · Apache License 2.0 · Updated Dec 9, 2025
tilelang Public
Forked from tile-ai/tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ · Other · Updated Oct 17, 2025
cutlass Public
Forked from NVIDIA/cutlass

CUDA Templates for Linear Algebra Subroutines

C++ · Other · Updated Oct 1, 2025
cutlass_fpA_intB_gemm Public
Forked from tlc-pack/cutlass_fpA_intB_gemm

A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer

C++ · Apache License 2.0 · Updated Aug 7, 2025
mlc-llm Public
Forked from mlc-ai/mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Python · Apache License 2.0 · Updated Jul 25, 2025
Genesis Public
Forked from Genesis-Embodied-AI/Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python · Apache License 2.0 · Updated Apr 15, 2025
xgrammar Public
Forked from mlc-ai/xgrammar

C++ · Apache License 2.0 · Updated Nov 8, 2024
triton Public
Forked from triton-lang/triton

Development repository for the Triton language and compiler

C++ · MIT License · Updated Sep 20, 2024
diffusers Public
Forked from huggingface/diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python · Apache License 2.0 · Updated Jun 13, 2024
rust Public
Forked from rust-lang/rust

Empowering everyone to build reliable and efficient software.

Rust · Other · Updated Jun 5, 2024
gpt-fast Public
Forked from meta-pytorch/gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python · BSD 3-Clause "New" or "Revised" License · Updated May 8, 2024
vllm Public
Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · Apache License 2.0 · Updated May 3, 2024
llm.c Public
Forked from karpathy/llm.c

LLM training in simple, raw C/CUDA

Cuda · MIT License · Updated May 3, 2024
transformers Public
Forked from huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python · Apache License 2.0 · Updated Feb 27, 2024
package Public
Forked from mlc-ai/package

Shell · Apache License 2.0 · Updated Oct 19, 2023
web-llm Public
Forked from mlc-ai/web-llm

Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.

Python · Apache License 2.0 · Updated May 18, 2023
stablehlo Public
Forked from openxla/stablehlo

Backward compatible ML compute opset inspired by HLO/MHLO

MLIR · Apache License 2.0 · Updated Mar 22, 2023
jax Public
Forked from jax-ml/jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python · Apache License 2.0 · Updated Mar 9, 2023
relax Public
Forked from tlc-pack/relax

Temp repo for prototyping relax(relay next), the effort will be upstreamed. We use the wiki pages on this repo to host design docs.

Python · Apache License 2.0 · Updated Feb 23, 2023
