Organizations

@IBM @stolostron @neuralmagic @llm-d

World's first fully integrated and fully automated Kubernetes management and orchestration solution

TypeScript 72 58 Updated Apr 27, 2026

Let my Claude talk to yours.

Go 24 7 Updated Apr 27, 2026

General agent evaluation framework

Python 51 10 Updated Apr 27, 2026

A personal PR-review extension.

Python 1 Updated Mar 9, 2026

Standardized Serverless ML Inference Platform on Kubernetes

Go 3 20 Updated Apr 27, 2026

A framework for efficient model inference with omni-modality models

Python 4,521 844 Updated Apr 27, 2026

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 895 105 Updated Apr 26, 2026

Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?

Rust 13 8 Updated Apr 27, 2026

Main Kagenti repo - installer, UI and docs

Python 190 76 Updated Apr 27, 2026

GenAI inference performance benchmarking tool

Python 178 87 Updated Apr 27, 2026

Incubating P/D sidecar for llm-d

Go 17 29 Updated Nov 13, 2025

llm-d benchmark scripts and tooling

Python 58 71 Updated Apr 27, 2026

A lightweight, configurable, and real-time simulator designed to mimic the behavior of vLLM without the need for GPUs or running actual heavy models.

Go 121 81 Updated Apr 27, 2026

Helm charts for llm-d

Shell 52 57 Updated Jul 22, 2025

Inference scheduler for llm-d

Go 175 176 Updated Apr 27, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,088 440 Updated Apr 27, 2026

Distributed KV cache scheduling & offloading libraries

Go 138 116 Updated Apr 27, 2026
Python 106 26 Updated Jul 21, 2025

Gateway API Inference Extension

Go 657 282 Updated Apr 27, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 78,330 16,162 Updated Apr 27, 2026

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,838 1,038 Updated Mar 30, 2026

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 2,307 394 Updated Apr 27, 2026
Go 1 Updated Jan 28, 2025

LangChain for Go, the easiest way to write LLM-based programs in Go

Go 9,157 1,093 Updated Jan 11, 2026

GUI tool for visualizing the result data of a de Bruijn sequence complexity distribution study

C++ 2 Updated Feb 20, 2024

KubeStellar - a flexible solution for multi-cluster configuration management for edge, multi-cloud, and hybrid cloud

Go 660 266 Updated Apr 16, 2026

The main repository for the multicluster global hub

Go 22 36 Updated Apr 26, 2026