@llm-d-incubation

llm-d incubation

Incubating components of llm-d, a Kubernetes-native high-performance distributed LLM inference framework

Popular repositories

  1. llm-d-infra Public

    llm-d helm charts and deployment examples

    Shell, 42 stars, 43 forks

  2. workload-variant-autoscaler Public

    Variant optimization autoscaler for distributed inference workloads

    Go, 18 stars, 14 forks

  3. llm-d-modelservice Public

    helm charts for deploying models with llm-d

    Smarty, 18 stars, 31 forks

  4. llm-d-fast-model-actuation Public

    Go, 3 stars, 6 forks

  5. llm-d-ci Public

    Shell, 2 stars, 2 forks

  6. ig-wva Public

    Workload Variant Autoscaler is a service to compute the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives

    Jupyter Notebook, 1 star, 1 fork

Repositories

Showing 7 of 7 repositories
  • workload-variant-autoscaler Public

    Variant optimization autoscaler for distributed inference workloads

    Go, 18 stars, Apache-2.0, 14 forks, 47 issues, 8 pull requests, updated Oct 9, 2025
  • llm-d-fast-model-actuation Public

    Go, 3 stars, Apache-2.0, 6 forks, 22 issues, 9 pull requests, updated Oct 9, 2025
  • llm-d-modelservice Public

    helm charts for deploying models with llm-d

    Smarty, 18 stars, 31 forks, 3 issues (1 issue needs help), 4 pull requests, updated Oct 8, 2025
  • hermes Public

    Hermes is a cluster configuration scanning and self-test generation tool for llm-d inference workloads

    Rust, 0 stars, 0 forks, 0 issues, 1 pull request, updated Oct 7, 2025
  • llm-d-infra Public

    llm-d helm charts and deployment examples

    Shell, 42 stars, Apache-2.0, 43 forks, 13 issues, 16 pull requests, updated Oct 2, 2025
  • llm-d-ci Public
    Shell, 2 stars, 2 forks, 0 issues, 0 pull requests, updated Aug 6, 2025
  • ig-wva Public

    Workload Variant Autoscaler is a service to compute the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives

    Jupyter Notebook, 1 star, 1 fork, 0 issues, 1 pull request, updated Jul 11, 2025
