Stars
red-hat-data-services / kserve
Forked from opendatahub-io/kserve. Standardized Serverless ML Inference Platform on Kubernetes
A framework for efficient model inference with omni-modality models
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?
GenAI inference performance benchmarking tool
A lightweight, configurable, and real-time simulator designed to mimic the behavior of vLLM without the need for GPUs or running actual heavy models.
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
Distributed KV cache scheduling & offloading libraries
Gateway API Inference Extension
A high-throughput and memory-efficient inference and serving engine for LLMs
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
LangChain for Go, the easiest way to write LLM-based programs in Go
GUI tool for visualizing the result data of a de Bruijn sequence complexity distribution study
KubeStellar - a flexible solution for multi-cluster configuration management for edge, multi-cloud, and hybrid cloud
the main repository for the multicluster global hub