- Bay Area, CA
Stars
Test LLMs on real tasks. Compare models side-by-side.
Control panel for vLLM, SGLang, llama.cpp, exllamav3
Orca is the next-gen IDE for working with a fleet of parallel agents. Run any coding agent with your own subscription. Available on desktop and mobile.
Desktop Companion for Hermes Agent
AtomicBot-ai / Atomic-Chat
Forked from janhq/jan
Atomic-Chat is an open source alternative to ChatGPT that runs 100% offline on your computer.
llama.cpp fork with TurboQuant WHT-rotated KV cache & weight compression + Gemma 4 MTP and Qwen 3.6 NextN speculative decoding (+30-50% throughput).
Browser Harness | Self-healing harness that enables LLMs to complete any task.
Mutation testing for Go source code. Fork from https://github.com/zimmski/go-mutesting
turbo-tan / llama.cpp-tq3
Forked from ggml-org/llama.cpp
llama.cpp fork with TQ3_1S/4S CUDA kernels: 3.5-bit WHT quantization achieving Q4s quality at 10% smaller size. Based on RaBitQ-inspired Walsh-Hadamard transform. Enables 27B models on 16GB GPUs w…
llama.cpp fork with additional SOTA quants and improved performance
A knowledge graph for the notes you already have. Plain Markdown, git-native, fully local. Wire anything to anything with semantic predicates (supports, depends on, contradicts, relates to goal) …
The open-source security layer for AI agents. Deterministic guardrails, PII redaction, and EU AI Act compliance in one line of code.
This solution provides an automated, serverless way to redact sensitive data from PDF files using Google Cloud Services like Data Loss Prevention (DLP), Cloud Workflows, and Cloud Run.
Community recipes for serving LLMs on RTX 3090. Multi-engine (vLLM, llama.cpp, SGLang) and model-agnostic. Currently shipping Qwen3.6-27B configs for 1× and 2× cards.
Your First LLM-Wiki Conversation Knowledge Base
King of Spades: a cinematic playing-card dashboard theme for Hermes Agent.
A modern platform for visual, flexible, and extensible graph-based investigations. For cybersecurity analysts and investigators.
Terminal AI that reads code as code. AST surgery, LSP operations, a live dependency graph, not grep-and-paste.
Custom skins (visual themes) for the Hermes CLI agent
Penpot: The open-source design tool for design and code collaboration
A standalone BMAD module that transforms code repositories, documentation websites, and developer discourse into agentskills.io-compliant, version-pinned, provenance-backed agent skills.
llama.cpp fork with TurboQuant quantization (turbo2/3/4) and TriAttention GPU-accelerated KV cache pruning. 75 tok/s on Qwen3-8B / RTX 3080.
Hermes WebUI: The best way to use Hermes Agent from the web or from your phone!
The headless browser for AI agents and web scraping
Lucebox: LLM inference server built for speed for specific consumer hardware.
Supercharge Your LLM Application Evaluations