Open-source control plane for running, governing, and scaling self-hosted AI agent fleets.
AgentStack is an infrastructure control plane for teams that want to run AI agents on infrastructure they control: Kubernetes clusters, private clouds, bare-metal machines, local runners, and edge environments.
It is not another chatbot, agent builder, or business workflow product. AgentStack focuses on the shared operational layer needed by every serious agent runtime:
- agent lifecycle management;
- MCP server registration, health, and binding;
- skill package installation, approval, versioning, and rollout;
- model endpoint, token budget, and compute resource control;
- workspace, memory, secret, and sandbox isolation;
- fleet observability, auditability, quota, and cost tracking;
- self-hosted and enterprise deployment.
The short version:
AgentStack is Rancher for self-hosted AI agents.
This repository currently builds on a Kubernetes-native kagent foundation. During the transition to AgentStack, some commands, namespaces, API groups, package names, and Helm chart names still use kagent.
The agent ecosystem is moving quickly. Teams are adopting new runtimes, MCP servers, skills, tools, model endpoints, memory systems, and sandboxes at the same time.
That creates an infrastructure problem:
- How do you run 100 or 1,000 agent instances reliably?
- How do you bind agents only to approved MCP servers?
- How do you control which skills agents can install and execute?
- How do you allocate LLM, GPU, token, and concurrency budgets across teams?
- How do you isolate workspaces, memory, credentials, tools, and artifacts?
- How do you upgrade, roll back, and observe heterogeneous agent fleets?
- How do you operate agents inside private, air-gapped, or enterprise environments?
AgentStack provides a common control plane for these agent workloads.
AgentStack intentionally stays out of business-specific workflow and permission decisions.
It is not:
- a public internet chatbot;
- a general-purpose office assistant;
- a replacement for ChatGPT, Copilot, Doubao, WorkBuddy, or similar end-user products;
- an enterprise data governance system;
- a CRM, ERP, OA, ITSM, or approval workflow layer;
- a magical bypass for existing enterprise permissions.
AgentStack manages the infrastructure boundary:
- which agent runtime is allowed to run;
- which model endpoint it can use;
- which MCP servers it can access;
- which skills it can install;
- which workspace, memory backend, and sandbox profile it owns;
- how much token, compute, and concurrency budget it receives;
- where logs, traces, artifacts, approvals, and costs are recorded.
Business authorization remains the job of the systems that own the business data.
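One way to picture this boundary is as a small allowlist-and-budget record per agent. The sketch below is illustrative only: `AgentBoundary`, its fields, and `check_request` are hypothetical names chosen for this example, not part of the AgentStack or kagent API.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBoundary:
    """Hypothetical per-agent infrastructure boundary (not an AgentStack API)."""

    model_endpoints: set[str] = field(default_factory=set)
    mcp_servers: set[str] = field(default_factory=set)
    token_budget: int = 0  # total tokens the agent may spend
    tokens_used: int = 0   # tokens already spent


def check_request(boundary: AgentBoundary, *, model: str, mcp_server: str, tokens: int) -> list[str]:
    """Return the list of boundary violations; an empty list means allowed."""
    violations = []
    if model not in boundary.model_endpoints:
        violations.append(f"model endpoint not allowed: {model}")
    if mcp_server not in boundary.mcp_servers:
        violations.append(f"MCP server not allowed: {mcp_server}")
    if boundary.tokens_used + tokens > boundary.token_budget:
        violations.append("token budget exceeded")
    return violations


# Example names ("docs-mcp", "github-mcp") are made up for illustration.
boundary = AgentBoundary(
    model_endpoints={"default-model-config"},
    mcp_servers={"docs-mcp"},
    token_budget=10_000,
    tokens_used=9_500,
)
violations = check_request(
    boundary, model="default-model-config", mcp_server="github-mcp", tokens=1_000
)
print(violations)
```

The check covers only infrastructure-level facts. Whether an agent may read a particular business record is still decided by the system that owns that record.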
AgentStack is organized around three planes.
- Control Plane: API server, web UI, CLI, RBAC, audit, quota, policy, catalog, and fleet management.
- Resource Plane: shared resources used by agents, including model pools, MCP servers, skill packages, secrets, memory stores, workspaces, sandboxes, artifact stores, and observability backends.
- Runtime Plane: agent instances running through Kubernetes, Docker, VM, bare-metal, local, or edge runners.
Current implementation shape:
+-------------+   +--------------+   +-------------+
| Controller  |   | HTTP Server  |   |     UI      |
|    (Go)     |-->|     (Go)     |-->|  (Next.js)  |
+-------------+   +--------------+   +-------------+
       |                 |
       v                 v
+-------------+   +--------------+
|  Database   |   | Agent Runtime|
| (SQLite/PG) |   |   (Python)   |
+-------------+   +--------------+
       |                 |
       v                 v
+-------------+   +--------------+
| Kubernetes  |   | MCP Servers  |
|    CRDs     |   |  and Tools   |
+-------------+   +--------------+
Current components include:
- a Go Kubernetes controller and HTTP API server;
- a Next.js management UI;
- a Python ADK-based agent runtime, plus framework adapters and samples;
- Kubernetes CRDs for agents, model configs, MCP servers, and tools;
- Helm charts for local and cluster deployment;
- SQLite/PostgreSQL-backed metadata and conversation state;
- built-in support for MCP tools, model providers, human-in-the-loop tool approval, memory, and observability hooks.
AgentStack is designed around the CRDs that already exist in the current control plane.
An Agent is the standard Kubernetes-native agent workload. It is used for agents that run inside the cluster and expose the A2A protocol.
Agents support two modes:
- Declarative agents, where the CRD owns the model, instructions, tools, and deployment shape.
- BYO agents, where the user provides a container image that serves A2A on port 8080.
Declarative agents currently support two built-in runtimes:
- python
- go
The current tree includes ADK, LangGraph, CrewAI, and OpenAI-style samples. Those frameworks are packaged as agent code or images that run through the existing Agent deployment path.
AgentHarness is the long-lived harness environment for remote or sandboxed agent systems such as OpenClaw-style gateways. It is AgentStack's primary abstraction for harnessed runtimes.
The current AgentHarness API separates two dimensions:
- spec.runtime: the control plane used to provision the harness VM. Current values are openshell and substrate.
- spec.backend: the agent system that runs inside the harness. Current values are openclaw, nemoclaw, and hermes.
For runtime: substrate, AgentStack generates a per-harness ActorTemplate, creates or resumes the actor through Substrate, and treats WorkerPool capacity as an external platform resource.
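Because runtime and backend are independent dimensions, a manifest can be sanity-checked against each list separately before it is applied. The following is a minimal sketch: the allowed values come from the text above, while `validate_harness_spec` is a hypothetical helper and not something the controller is known to expose.

```python
# Allowed values as listed in this README; they may change as the API evolves.
RUNTIMES = {"openshell", "substrate"}
BACKENDS = {"openclaw", "nemoclaw", "hermes"}


def validate_harness_spec(spec: dict) -> list[str]:
    """Return human-readable problems with spec.runtime and spec.backend."""
    problems = []
    runtime = spec.get("runtime")
    backend = spec.get("backend")
    if runtime not in RUNTIMES:
        problems.append(f"spec.runtime must be one of {sorted(RUNTIMES)}, got {runtime!r}")
    if backend not in BACKENDS:
        problems.append(f"spec.backend must be one of {sorted(BACKENDS)}, got {backend!r}")
    return problems


# A valid combination yields no problems; an unknown runtime yields one.
print(validate_harness_spec({"runtime": "substrate", "backend": "openclaw"}))
print(validate_harness_spec({"runtime": "docker", "backend": "hermes"}))
```

Checking the two fields independently mirrors the API's separation: any listed backend is treated here as compatible with any listed runtime, which is an assumption of this sketch.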
Example:
apiVersion: kagent.dev/v1alpha2
kind: AgentHarness
metadata:
  name: research-harness
  namespace: kagent
spec:
  runtime: substrate
  backend: openclaw
  description: OpenClaw running on Agent Substrate
  modelConfigRef: default-model-config
  substrate:
    workerPoolRef:
      name: kagent-default
    gatewayTokenSecretRef:
      name: openclaw-gateway-token
  channels:
    - type: slack
      slack:
        botToken:
          valueFrom:
            type: Secret
            name: slack-bot-token
            key: token
        appToken:
          valueFrom:
            type: Secret
            name: slack-app-token
            key: token

The current Kubernetes API also includes ModelConfig, RemoteMCPServer, ToolServer, and managed MCP resources. See docs/architecture/crds-and-types.md for the current CRD model and docs/substrate-agentharness-lifecycle.md for the Substrate ownership model.
.
+-- go/ # Go workspace: API types, controller, HTTP server, CLI, Go ADK
+-- python/ # Python runtime, ADK packages, framework integrations, samples
+-- ui/ # Next.js web interface
+-- helm/ # CRD and application Helm charts
+-- docs/ # Architecture and design documentation
+-- examples/ # Example integrations and workloads
+-- contrib/ # Community and extension contributions
+-- scripts/ # Local development and cluster helpers
Language boundaries are intentional:
- Go owns Kubernetes controllers, CLI, core APIs, HTTP server, and database infrastructure.
- Python owns agent runtime logic, ADK integrations, LLM providers, and framework adapters.
- TypeScript owns the web UI and browser-side API clients.
Prerequisites: Docker, Docker Buildx, Kind, kubectl, Helm, Go, and Make.
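Before creating the cluster, it can help to confirm that the prerequisites are installed. The sketch below is a hypothetical helper, not a script shipped in this repository. The executable names are assumptions about a typical installation, and Docker Buildx is skipped because it is usually a Docker CLI plugin rather than a separate binary on PATH.

```python
import shutil

# Assumed executable names for the PATH-based prerequisites listed above.
REQUIRED_TOOLS = ["docker", "kind", "kubectl", "helm", "go", "make"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that `which` cannot locate on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All PATH-based prerequisites found.")
```

The `which` parameter is injectable so the function can be tested without depending on what is installed on the machine.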
Create a local Kind cluster:
make create-kind-cluster
make use-kind-cluster
Choose a model provider and set its key:
export KAGENT_DEFAULT_MODEL_PROVIDER=openAI
export OPENAI_API_KEY=your-openai-api-key
Deploy the stack:
make helm-install
Open the UI:
kubectl port-forward -n kagent svc/kagent-ui 3000:8080
Then visit http://localhost:3000.
For more setup details, see DEVELOPMENT.md and helm/README.md.
Common commands:
make build
make -C go test
make -C go lint
make -C python lint
make -C go e2e
After CRD changes:
make controller-manifests
After translator changes:
UPDATE_GOLDEN=true make -C go test
Architecture references:
- docs/architecture/README.md
- docs/architecture/README_CN.md
- docs/architecture/crds-and-types.md
- docs/architecture/controller-reconciliation.md
- docs/architecture/data-flow.md
- docs/architecture/human-in-the-loop.md
AgentStack is alpha-stage infrastructure. Near-term work focuses on:
- completing the naming transition from kagent to AgentStack;
- hardening the existing Agent and AgentHarness CRD model;
- making AgentHarness runtime and backend adapters first-class extension points;
- strengthening MCP, skill, model, memory, workspace, and sandbox registries;
- adding clearer quota, cost, audit, and policy controls;
- improving fleet observability and operational workflows;
- hardening self-hosted, private, and enterprise deployment paths.
Contributions are welcome.
Use conventional commit messages such as:
feat: add runtime adapter registry
fix: preserve usage metadata in streaming responses
docs: rewrite AgentStack README
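A commit subject can be checked against this convention with a short script. This is a simplified sketch and not a hook provided by the repository: the pattern covers only the basic `type(scope)!: description` shape, and the list of accepted types beyond feat, fix, and docs is an assumption.

```python
import re

# Simplified pattern: "<type>[(scope)][!]: <description>".
# Only feat, fix, and docs appear in this README; the other types are assumed.
COMMIT_RE = re.compile(r"^(feat|fix|docs|chore|refactor|test)(\([\w.-]+\))?!?: \S.*$")


def is_conventional(subject: str) -> bool:
    """Check a commit subject line against the simplified pattern."""
    return COMMIT_RE.match(subject) is not None


for subject in [
    "feat: add runtime adapter registry",
    "fix: preserve usage metadata in streaming responses",
    "Update README",
]:
    print(subject, "->", is_conventional(subject))
```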
This project is licensed under the Apache 2.0 License.