- sglang Public
  Forked from sgl-project/sglang
  SGLang is a fast serving framework for large language models and vision language models.
  Python, Apache License 2.0, Updated Oct 9, 2025
- dynamo Public
  Forked from ai-dynamo/dynamo
  A Datacenter Scale Distributed Inference Serving Framework
  Rust, Apache License 2.0, Updated Sep 23, 2025
- tensorzero Public
  Forked from tensorzero/tensorzero
  TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
  Rust, Apache License 2.0, Updated Aug 21, 2025
- genai-bench Public
  Forked from sgl-project/genai-bench
  Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.
  Python, MIT License, Updated Aug 13, 2025
- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python, Apache License 2.0, Updated Aug 12, 2025
- ome Public
  Forked from sgl-project/ome
  OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)
  Go, MIT License, Updated Aug 9, 2025
- nano-vllm Public
  Forked from GeeeekExplorer/nano-vllm
  Nano vLLM
  Python, MIT License, Updated Jun 27, 2025
- KAI-Scheduler Public
  Forked from NVIDIA/KAI-Scheduler
  KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale
  Go, Apache License 2.0, Updated May 29, 2025
- production-stack Public
  Forked from vllm-project/production-stack
  vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
  Python, Apache License 2.0, Updated Apr 17, 2025