AWS
- San Jose
Stars
Can AI Agents Build Bespoke LLM Serving Systems?
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
[MLSys 2026] AccelOpt: Self-improving Agents for AI Accelerator Kernel Optimization
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
torchax is a PyTorch frontend for JAX. It gives JAX the ability to author JAX programs using familiar PyTorch syntax. It also provides JAX-Pytorch interoperability, meaning, one can mix JAX & Pytor…
TPU inference for vLLM, with unified JAX and PyTorch support.
🚀 Efficient implementations for emerging model architectures
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Muon is an optimizer for hidden layers in neural networks
A Datacenter Scale Distributed Inference Serving Framework
A collection of awesome-prompt-datasets, awesome-instruction-dataset, to train ChatLLM such as ChatGPT. It gathers a wide variety of instruction datasets for training ChatLLM models.
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
SGLang is a high-performance serving framework for large language models and multimodal models.
Stop renting your intelligence. Own it with AnythingLLM. Everything you need for a powerful local-first agent experience
LLM training code for Databricks foundation models
NumPy and SciPy on Multi-Node Multi-GPU systems
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs