Stars
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
Low-overhead Kubernetes informer for sidecar controllers that don't need the full object
llm-d Router: The intelligent entry point for inference requests
The simplest, fastest repository for training/finetuning medium-sized GPTs.
A high-throughput and memory-efficient inference and serving engine for LLMs
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
A lightweight, configurable, and real-time simulator designed to mimic the behavior of vLLM without the need for GPUs or running actual heavy models.
Gateway API Inference Extension
Pull-through caching proxy with resumable downloads for OCI images
Collective communications library with various primitives for multi-machine training.
Red Hat Device Edge image construction
A script to run docker-compose.yml using Podman
QMK TrackBall with 3 Switches (TB3S)
GitHub Action self-hosted runner images for OpenShift.
Low-level unprivileged sandboxing tool used by Flatpak and similar projects
Podman: A tool for managing OCI containers and pods.
Generic control plane for creating Kubernetes-like APIs