- Tencent
- Beijing
Stars
A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.
Build a virtual machine image from a Dockerfile or Docker image
Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
High performance server-side application framework
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
slime is an LLM post-training framework for RL Scaling.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Tutel MoE: an optimized Mixture-of-Experts library supporting GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4
MSCCL++: A GPU-driven communication stack for scalable AI applications
A throughput-oriented high-performance serving framework for LLMs
Synchronization and asynchronous computation package for Go
[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Borgo is a statically typed language that compiles to Go.
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
GLake: optimizing GPU memory management and IO transmission.
A fast inference library for running LLMs locally on modern consumer-class GPUs
Ring attention implementation with flash attention