Stars
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Post-training with Tinker
An open-source AI agent that brings the power of Gemini directly into your terminal.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
Minimalistic large language model 3D-parallelism training
Universal memory layer for AI Agents; Announcing OpenMemory MCP - local and secure memory management.
My learning notes and code for ML systems.
Scalable toolkit for efficient model reinforcement
Curated collection of papers in MoE model inference
Large Language Model (LLM) Systems Paper List
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient Multi-head Latent Attention Kernels
verl: Volcano Engine Reinforcement Learning for LLMs
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Official Repo for Open-Reasoner-Zero