tqchen
🎯 Focusing

Highlights

  • Pro

Organizations

@apache @dmlc @uwsampl @octoml

Fast and memory-efficient classical machine learning operators

Python 519 39 Updated Jun 18, 2026

Compact and Agent-Native MoE Training System

Python 201 16 Updated Jun 18, 2026

A kernel library written in tilelang

Python 1,596 140 Updated Apr 23, 2026

The open-source agent-serving project

Python 477 32 Updated Jun 8, 2026
Rust 7 Updated Mar 9, 2026

Our first fully AI generated deep learning system

Python 630 48 Updated Feb 2, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,426 704 Updated May 17, 2026

Fast and memory-efficient exact attention

Python 24,182 2,842 Updated Jun 19, 2026

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 2,080 140 Updated Jun 17, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,586 267 Updated Jun 19, 2026

Perplexity open source garden for inference technology

Rust 581 56 Updated May 27, 2026

Building the Virtuous Cycle for AI-driven LLM Systems

Python 250 41 Updated May 1, 2026

JAX support for tvm-ffi abi

Python 26 5 Updated May 14, 2026

Open ABI and FFI for Machine Learning Systems

C++ 416 81 Updated Jun 19, 2026

Ship correct and fast LLM kernels to PyTorch

Python 151 17 Updated Jan 14, 2026

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 888 152 Updated Jun 19, 2026

A size profiler for CUDA binaries

Python 70 Updated Jan 15, 2026

An extremely fast Python package and project manager, written in Rust.

Rust 86,547 3,219 Updated Jun 19, 2026

VS Code extension for syntax highlighting C++/CUDA/HIP code in PyTorch load_inline() strings

Python 9 Updated Jul 25, 2025

RFC document, tooling and other content related to the array API standard

Python 268 54 Updated Apr 23, 2026

AGENTS.md — a simple, open format for guiding coding agents

TypeScript 22,327 1,642 Updated Mar 12, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,910 2,480 Updated Jun 19, 2026

🎡 Build Python wheels for all the platforms with minimal configuration.

Python 2,241 319 Updated Jun 18, 2026

A next generation Python CMake adaptor and Python API for plugins

Python 474 89 Updated Jun 18, 2026

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 487 26 Updated Jun 11, 2026

Minimum example for deploying Apache TVM's Relax IR using C++ API

C++ 6 1 Updated Nov 29, 2025
Python 119 10 Updated Sep 13, 2025

JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training

Python 79 2 Updated Jun 18, 2026

Distributed Compiler based on Triton for Parallel Systems

Python 1,462 151 Updated Apr 22, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 7,295 1,259 Updated Jun 19, 2026