llm-d

Software Development

Open source project providing distributed inferencing for Generative AI runtimes on any Kubernetes cluster.

About us

llm-d is a new open source project focused on providing distributed inferencing for Generative AI runtimes on any Kubernetes cluster. Its architecture is designed for high performance and scalability, and it aims to reduce costs through a range of hardware and software efficiency improvements. llm-d prioritizes ease of deployment and use, as well as the operational needs of running large GPU clusters, including SRE concerns and day-2 operations. At launch, its key features include prefill/decode disaggregation, KV cache distribution and management, an AI-aware router with customizable scoring, operational telemetry, Kubernetes-based deployment, and NIXL, an inference-optimized transfer library.

Website
https://llm-d.ai/
Industry
Software Development
Company size
11-50 employees
Type
Nonprofit
Founded
2025
