Stars
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
slime is an LLM post-training framework for RL Scaling.
Production-ready platform for agentic workflow development.
Accelerating MoE with IO and Tile-aware Optimizations
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
A Chinese-language financial trading framework based on multi-agent LLMs - TradingAgents Chinese enhanced edition
An extremely fast Python package and project manager, written in Rust.
Ongoing research training transformer models at scale
The official Python SDK for Model Context Protocol servers and clients
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
SGLang is a high-performance serving framework for large language models and multimodal models.
A high-throughput and memory-efficient inference and serving engine for LLMs
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
A Data Streaming Library for Efficient Neural Network Training
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.