-
EveryMatrix
Stars
Local-first MCP server that gives coding agents structured context packets, code/schema facts, and diagnostics - backed by a local SQLite store.
Real-time network diagnostics in your terminal. One command, zero config, instant visibility.
high-performance linear attention kernel library built on TileLang
Ansible playbooks for a 3-node K3s cluster with NVIDIA DGX Spark nodes for distributed LLM inference
ComfyUI-QwenVL custom node: Integrates the Qwen-VL series, including Qwen2.5-VL and the latest Qwen3-VL, with GGUF support for advanced multimodal AI in text generation, image understanding, and vi…
stardust7700 / ComfyUI
Forked from Comfy-Org/ComfyUIThe most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Cardamon is a cleanup tool for Prometheus that collects unused metrics from Grafana and Prometheus and generates drop statements for them.
A lightweight schema-on-read analytics in a single binary
The only tool you need to know what is happening and how to fix it.
Single-file web UI for NVIDIA DGX Spark — pull Ollama models, browse and download from HuggingFace, manage LiteLLM routing, and control SGLang, vLLM, llama.cpp, LocalAI, and ComfyUI. All from one b…
A coding agent optimized to smaller LLMs
Export helm stats into the Prometheus format
Linux-native "fake root" for implementing rootless containers
Self-hosted, semantically-connected personal knowledge base
Lightweight OpenTelemetry viewer for local development. View traces, logs & metrics instantly.
Tool-calling quality benchmark for LLM serving stacks. 65+ deterministic scenarios testing multi-turn orchestration, safety boundaries, and structured output. Supports vLLM, LiteLLM, and llama.cpp.
Port Kill helps you find and free ports and caches blocking your dev work.
The Helm chart you can use to install any of your applications into Kubernetes/OpenShift
Turn any NVIDIA GPU into a local AI platform. Inference + fine-tuning in your browser. One command to start, automatic clustering.
High-performance uncensored Gemma-4-26B inference on NVIDIA DGX Spark using vLLM - 45+ tok/s
The Prometheus monitoring system and time series database.
nono - a capability-based, multiplexing sandbox tool, built for developers - lift'n'shift seamless path to prod. Run agents securely without needing any additional infra, zero setup, zero latency.
A terminal-based system monitor (TUI) optimized for NVIDIA Grace Blackwell (GB10) and hybrid CPU architectures.
Check your Kubernetes changes before they hit the cluster