yurayli

Stars

High performance ML

16 repositories

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python · 34,350 stars · 3,309 forks · Updated Dec 18, 2025

Flax is a neural network library for JAX that is designed for flexibility.

Jupyter Notebook · 6,982 stars · 769 forks · Updated Dec 16, 2025

Fast and memory-efficient exact attention

Python · 21,154 stars · 2,228 forks · Updated Dec 16, 2025

Accessible large language models via k-bit quantization for PyTorch.

Python · 7,828 stars · 800 forks · Updated Dec 12, 2025

Code repository for the paper - "Matryoshka Representation Learning"

Jupyter Notebook · 586 stars · 36 forks · Updated Feb 19, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python · 41,022 stars · 4,668 forks · Updated Dec 17, 2025

Implementation for MatMul-free LM.

Python · 3,040 stars · 199 forks · Updated Dec 2, 2025

Efficient Triton Kernels for LLM Training

Python · 5,954 stars · 450 forks · Updated Dec 17, 2025

Development repository for the Triton language and compiler

MLIR · 17,866 stars · 2,454 forks · Updated Dec 18, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 65,649 stars · 12,035 forks · Updated Dec 18, 2025

Large Language Model Text Generation Inference

Python · 10,709 stars · 1,246 forks · Updated Dec 11, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python · 12,416 stars · 1,963 forks · Updated Dec 17, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python · 21,597 stars · 3,787 forks · Updated Dec 18, 2025

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Go · 157,816 stars · 13,944 forks · Updated Dec 17, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python · 64,116 stars · 7,771 forks · Updated Dec 16, 2025

On-the-fly conversions between Jax and NumPy tensors

Python · 57 stars · 9 forks · Updated Mar 17, 2023