Tianqi Chen tqchen

🎯

Focusing

Machine Learning and Systems

12.5k followers · 128 following

CMU, NVIDIA
https://tqchen.com/

Achievements

x3 x4 x4

Stars

NVlabs / vibetensor

Our first fully AI generated deep learning system

Python · 531 stars · 37 forks · Updated Feb 2, 2026

sgl-project / mini-sglang

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python · 3,491 stars · 435 forks · Updated Feb 11, 2026

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python · 22,261 stars · 2,383 forks · Updated Feb 16, 2026

NVIDIA / cutile-python

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python · 1,925 stars · 116 forks · Updated Feb 13, 2026

radixark / miles

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python · 880 stars · 109 forks · Updated Feb 15, 2026

perplexityai / pplx-garden

Perplexity open source garden for inference technology

Rust · 364 stars · 28 forks · Updated Dec 25, 2025

flashinfer-ai / flashinfer-bench

Building the Virtuous Cycle for AI-driven LLM Systems

Python · 176 stars · 26 forks · Updated Feb 13, 2026

NVIDIA / jax-tvm-ffi

JAX support for tvm-ffi abi

C++ · 23 stars · 3 forks · Updated Dec 10, 2025

apache / tvm-ffi

Open ABI and FFI for Machine Learning Systems

C++ · 346 stars · 60 forks · Updated Feb 16, 2026

meta-pytorch / BackendBench

Ship correct and fast LLM kernels to PyTorch

Python · 142 stars · 17 forks · Updated Jan 14, 2026

pytorch / helion

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python · 749 stars · 103 forks · Updated Feb 16, 2026

flashinfer-ai / cubloaty

a size profiler for cuda binary

Python · 72 stars · Updated Jan 15, 2026

astral-sh / uv

An extremely fast Python package and project manager, written in Rust.

Rust · 79,286 stars · 2,566 forks · Updated Feb 16, 2026

msaroufim / pytorch-load-inline-highlighter

VS Code extension for syntax highlighting C++/CUDA/HIP code in PyTorch load_inline() strings

Python · 9 stars · Updated Jul 25, 2025

data-apis / array-api

RFC document, tooling and other content related to the array API standard

Python · 264 stars · 54 forks · Updated Feb 5, 2026

agentsmd / agents.md

AGENTS.md — a simple, open format for guiding coding agents

TypeScript · 17,449 stars · 1,236 forks · Updated Dec 19, 2025

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python · 12,886 stars · 2,101 forks · Updated Feb 16, 2026

pypa / cibuildwheel

🎡 Build Python wheels for all the platforms with minimal configuration.

Python · 2,186 stars · 299 forks · Updated Feb 14, 2026

scikit-build / scikit-build-core

A next generation Python CMake adaptor and Python API for plugins

Python · 442 stars · 81 forks · Updated Feb 16, 2026

NVIDIA / tilus

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python · 444 stars · 15 forks · Updated Feb 4, 2026

mshr-h / tvm-relax-cpp-example

Minimum example for deploying Apache TVM's Relax IR using C++ API

C++ · 5 stars · Updated Nov 29, 2025

Infini-AI-Lab / Multiverse

Python · 113 stars · 10 forks · Updated Sep 13, 2025

NVIDIA / jaxpp

JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training

Python · 64 stars · 1 fork · Updated Feb 13, 2026

ByteDance-Seed / Triton-distributed

Distributed Compiler based on Triton for Parallel Systems

Python · 1,358 stars · 127 forks · Updated Feb 13, 2026

ai-dynamo / dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust · 6,099 stars · 856 forks · Updated Feb 16, 2026

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python · 2,919 stars · 312 forks · Updated Jan 14, 2026

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 6,178 stars · 818 forks · Updated Feb 3, 2026

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ · 12,489 stars · 985 forks · Updated Feb 6, 2026

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python · 19,238 stars · 3,246 forks · Updated Feb 16, 2026

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,965 stars · 288 forks · Updated May 15, 2025
