Spark

Single-binary Go pod orchestrator for GPU hosts. Accepts Kubernetes manifests and runs workloads via Podman on a single node. Connects to NATS for messaging.

Features

Kubernetes manifest support -- Pod, Job, CronJob, Deployment, StatefulSet
NATS messaging -- apply, delete, get, list pods over request/reply subjects
HTTP REST API -- health checks, resource queries, and pod CRUD (v1.2.0)
Resource-aware scheduling -- CPU, memory, and GPU tracking with allocatable limits
Priority-based preemption -- lower-priority pods evicted to make room for higher-priority work
SQLite state persistence -- WAL-mode database for crash recovery (v1.1.0)
Pod re-discovery -- recovers running pods from Podman after restart (v1.1.0)
Retention pruning -- automatic cleanup of completed/failed pods (v1.1.0)
Resource reconciliation -- periodic sync of actual vs requested CPU/memory/GPU usage (v1.2.0)
Graceful shutdown -- configurable drain timeout with ordered teardown (v1.2.0)
GPU detection -- NVIDIA GPU support with unified memory fallback (GB10)
CronJob scheduling -- cron expression parsing with Allow/Forbid/Replace concurrency policies
Directory watcher -- manifest hot-loading with SHA-256 change detection
Heartbeat publishing -- periodic node status over NATS
Event and log streaming -- lifecycle events and container logs published over NATS
Prometheus metrics -- /metrics endpoint with node, pod, and scheduler metrics (v1.3.0)
HTTP authentication -- bearer token auth with configurable token file (v1.3.0)
Pod logs via HTTP -- tail and SSE streaming of pod logs (v1.3.0)
Pod events via HTTP -- lifecycle event history with time filtering (v1.3.0)
Structured JSON logging -- configurable log format for aggregation (v1.3.0)
EmptyDir volumes -- tmpfs-backed scratch volumes for pods (v1.3.0)
Pod exec -- execute commands inside running containers via HTTP (v1.4.0)
Container port mapping -- expose container ports to the host via podman --publish (v1.4.0)
Init containers -- sequential initialization containers before main containers (v1.4.0)
GPU device assignment -- per-pod GPU device isolation via NVIDIA_VISIBLE_DEVICES (v1.4.0)
Image management -- list and pull container images via HTTP API (v1.4.0)
Security context -- runAsUser, privileged, capabilities add/drop forwarded to podman (v1.5.0)
Manifest removal -- file deletion stops pods, releases resources, unregisters cron jobs (v1.5.0)
CronJob registration on all paths -- NATS, HTTP, and filesystem all register cron jobs (v1.5.0)
Stuck pod recovery -- Scheduled and Preempted pods recovered after timeout (v1.5.0)
GPU count-based scheduling -- nvidia.com/gpu: N allocates N device slots, separate from GPU memory (v1.6.0)
Liveness probes -- exec and HTTP probes with configurable thresholds; reconciler restarts on failure (v1.6.0)
CronJob HTTP management -- list, inspect, and unregister cron jobs via REST API (v1.6.0)
Node info endpoint -- GPU model, device count, device IDs, CPU, memory, OS via HTTP (v1.6.0)
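
To make the priority-based preemption feature concrete, here is a minimal sketch of one greedy way to choose victims. It is not Spark's actual algorithm (ADR 005 describes that); the Pod type, the pickVictims helper, and the CPU-only accounting are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, trimmed-down view of a scheduled workload.
type Pod struct {
	Name      string
	Priority  int   // higher value = more important
	CPUMillis int64 // requested CPU in millicores
}

// pickVictims returns the running pods to evict so that incoming fits.
// ok is false when incoming cannot fit even after evicting every
// lower-priority pod; in that case nothing is evicted.
func pickVictims(running []Pod, incoming Pod, freeCPU int64) (victims []Pod, ok bool) {
	if incoming.CPUMillis <= freeCPU {
		return nil, true // fits without preemption
	}
	var candidates []Pod
	for _, p := range running {
		if p.Priority < incoming.Priority {
			candidates = append(candidates, p)
		}
	}
	// Evict the least important pods first.
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].Priority < candidates[j].Priority
	})
	freed := freeCPU
	for _, p := range candidates {
		if freed >= incoming.CPUMillis {
			break
		}
		victims = append(victims, p)
		freed += p.CPUMillis
	}
	if freed < incoming.CPUMillis {
		return nil, false
	}
	return victims, true
}

func main() {
	running := []Pod{
		{Name: "batch", Priority: 0, CPUMillis: 4000},
		{Name: "web", Priority: 5, CPUMillis: 2000},
		{Name: "train", Priority: 10, CPUMillis: 8000},
	}
	incoming := Pod{Name: "infer", Priority: 8, CPUMillis: 5000}
	victims, ok := pickVictims(running, incoming, 1000)
	fmt.Println(ok, victims)
}
```

A real scheduler would also account for memory and GPU requests; the sketch tracks CPU only to keep the selection logic visible.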

Quick Start

go build ./cmd/spark
./spark --nats nats://localhost:4222

On startup, Spark detects system resources (CPU, memory, GPU), connects to NATS, and begins watching /etc/spark/manifests (the --manifest-dir default) for Kubernetes manifests.

CLI Flags

Flag	Default	Description
`--nats`	`nats://localhost:4222`	NATS server URL
`--node-id`	hostname	Node identifier
`--manifest-dir`	`/etc/spark/manifests`	Directory to watch for manifests
`--gpu-max`	`1`	Max concurrent GPU pods
`--heartbeat-interval`	`10s`	Heartbeat publish interval
`--reconcile-interval`	`5s`	Reconciliation loop interval
`--system-reserve-cpu`	`2000`	CPU millicores reserved for system
`--system-reserve-memory`	`4096`	MB of RAM reserved for system
`--state-db`	`/var/lib/spark/state.db`	Path to SQLite database file
`--pod-retention`	`168h`	Retention period for completed/failed pods
`--http-addr`	`:8080`	HTTP listen address
`--shutdown-timeout`	`30s`	Max time to drain pods on shutdown
`--reconcile-resources-interval`	`60s`	Resource reconciliation interval
`--log-format`	`text`	Log output format (text or json)
`--api-token-file`	(empty)	Path to file containing API bearer token
`--housekeeping-interval`	`1m`	Housekeeping loop interval
`--completed-pod-ttl`	`1h`	TTL after which Completed pods are reaped (0 disables)
`--failed-pod-ttl`	`24h`	TTL after which Failed pods are reaped (0 disables)
`--orphan-reap-ttl`	`1h`	TTL after which terminal-state orphan podman pods are reaped (0 disables)
`--image-prune-interval`	`24h`	Interval between `podman image prune -f` runs (0 disables)

To override the TTL for a single pod, set the spark.feza.ai/ttl-after-finished annotation. The value can be any string that Go's time.ParseDuration accepts; 0s disables cleanup for that pod.

HTTP API

All endpoints are served on the address specified by --http-addr (default :8080).

Health Check

curl http://localhost:8080/healthz

{"status": "ok"}

List Resources

curl http://localhost:8080/api/v1/resources

{
  "total": {"cpu_millis": 20000, "memory_mb": 131072, "gpu_memory_mb": 131072},
  "reserved": {"cpu_millis": 2000, "memory_mb": 4096, "gpu_memory_mb": 0},
  "allocated": {"cpu_millis": 4000, "memory_mb": 8192, "gpu_memory_mb": 65536},
  "available": {"cpu_millis": 14000, "memory_mb": 118784, "gpu_memory_mb": 65536}
}

List Pods

curl http://localhost:8080/api/v1/pods

[
  {"name": "myapp", "status": "Running", "created_at": "2026-03-19T10:00:00Z"}
]

Get Pod

curl http://localhost:8080/api/v1/pods/myapp

Returns the full pod record including spec, status, events, and timestamps.

Apply Pod

curl -X POST http://localhost:8080/api/v1/pods \
  -H "Content-Type: application/yaml" \
  -d @pod.yaml

Accepts a Kubernetes manifest (YAML) in the request body. Supports Pod, Job, CronJob, Deployment, and StatefulSet kinds.

Delete Pod

curl -X DELETE http://localhost:8080/api/v1/pods/myapp

{"deleted": "myapp"}

Prometheus Metrics

curl http://localhost:8080/metrics

Returns node, pod, and scheduler metrics in Prometheus text exposition format (version 0.0.4).

Pod Logs

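curl "http://localhost:8080/api/v1/pods/myapp/logs?tail=50"

Returns the last 50 lines of pod logs as text/plain. Add ?follow=true to stream logs as Server-Sent Events (SSE) instead. Quote the URL so the shell does not treat ? as a glob character.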

Pod Events

curl http://localhost:8080/api/v1/pods/myapp/events

Returns pod lifecycle events as JSON. Use ?since=2026-03-19T00:00:00Z to filter by time.

Pod Exec

curl -X POST http://localhost:8080/api/v1/pods/myapp/exec \
  -H "Content-Type: application/json" \
  -d '{"command":["ls","-la"]}'

{"stdout":"total 0\ndrwxr-xr-x ...","stderr":"","exit_code":0}

List Images

curl http://localhost:8080/api/v1/images

Pull Image

curl -X POST http://localhost:8080/api/v1/images/pull \
  -H "Content-Type: application/json" \
  -d '{"image":"localhost:5000/mymodel:latest"}'

Node Info

curl http://localhost:8080/api/v1/node

{
  "hostname": "dgx-spark",
  "os": "linux",
  "arch": "arm64",
  "cpu_cores": 72,
  "memory_total_mb": 131072,
  "gpu_model": "NVIDIA GH200",
  "gpu_count": 1,
  "gpu_device_ids": [0],
  "gpu_memory_mb": 131072
}

List CronJobs

curl http://localhost:8080/api/v1/cronjobs

[
  {"name": "train-nightly", "schedule": "0 2 * * *", "next_run": "2026-03-21T02:00:00Z", "run_count": 14}
]

Get CronJob

curl http://localhost:8080/api/v1/cronjobs/train-nightly

Delete CronJob

curl -X DELETE http://localhost:8080/api/v1/cronjobs/train-nightly

Authentication

When --api-token-file is set, all HTTP endpoints except /healthz and /metrics require a bearer token:

curl -H "Authorization: Bearer <token>" http://localhost:8080/api/v1/pods

Without the header, requests return 401 Unauthorized. If --api-token-file is not set, authentication is disabled.

NATS Subjects

Subject	Purpose
`req.spark.apply`	Apply a pod manifest (request/reply)
`req.spark.delete`	Delete a pod (request/reply)
`req.spark.get`	Get pod status (request/reply)
`req.spark.list`	List all pods (request/reply)
`evt.spark.event.{pod}`	Pod lifecycle events
`log.spark.{pod}`	Container log streaming
`heartbeat.spark.{node}`	Node heartbeat with resource usage
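
Subscribers need to fill in the {pod} and {node} placeholders. NATS subjects use "." as the token separator and reserve "*" and ">" as wildcards, so the sketch below rejects names containing those characters or whitespace. The helper names are hypothetical, and whether Spark sanitizes names the same way is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// subjectToken validates a name for use as a single NATS subject token.
func subjectToken(name string) (string, error) {
	if name == "" {
		return "", errors.New("empty subject token")
	}
	if strings.ContainsAny(name, ".*> \t\r\n") {
		return "", fmt.Errorf("invalid subject token %q", name)
	}
	return name, nil
}

// eventSubject returns the lifecycle event subject for a pod.
func eventSubject(pod string) (string, error) {
	tok, err := subjectToken(pod)
	if err != nil {
		return "", err
	}
	return "evt.spark.event." + tok, nil
}

// logSubject returns the log streaming subject for a pod.
func logSubject(pod string) (string, error) {
	tok, err := subjectToken(pod)
	if err != nil {
		return "", err
	}
	return "log.spark." + tok, nil
}

// heartbeatSubject returns the heartbeat subject for a node.
func heartbeatSubject(node string) (string, error) {
	tok, err := subjectToken(node)
	if err != nil {
		return "", err
	}
	return "heartbeat.spark." + tok, nil
}

func main() {
	for _, build := range []func(string) (string, error){eventSubject, logSubject, heartbeatSubject} {
		subject, err := build("myapp")
		fmt.Println(subject, err)
	}
	_, err := eventSubject("my.app")
	fmt.Println(err)
}
```

To receive events for every pod, a subscriber can use the wildcard subject evt.spark.event.* instead of building one subject per pod.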

Deployment

The deploy/ directory contains everything needed to run Spark on a DGX or similar GPU host:

File	Purpose
`setup-dgx.sh`	Full DGX setup: installs NATS, Spark, and systemd services
`setup-registry.sh`	Sets up a local OCI registry on port 5000
`spark.service`	Systemd unit for Spark
`nats-server.service`	Systemd unit for NATS
`registry.service`	Systemd unit for local OCI registry
`spark.env`	Environment variables for the Spark service
`install.sh`	Binary installation script
`nfpm/`	Deb packaging configuration

DGX Deployment

ssh ndungu@192.168.86.250
sudo bash deploy/setup-dgx.sh

This installs Spark and NATS as systemd services, configures the manifest directory, and starts both services.

Local OCI Registry

Spark uses a local OCI registry at localhost:5000 to store and serve container images, avoiding remote pulls during workload execution.

# Set up the registry
sudo bash deploy/setup-registry.sh

# Push an image
podman build -t myapp:latest .
podman tag myapp:latest localhost:5000/myapp:latest
podman push localhost:5000/myapp:latest

Reference images in pod manifests with the localhost:5000 prefix:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: localhost:5000/myapp:latest

Architecture

cmd/spark/          Entry point: flags, startup, signal handling
internal/
  api/              HTTP REST API handlers (health, resources, node, pods, exec, logs, events, images, cronjobs, metrics, auth)
  bus/              NATS bus abstraction, protocol handlers, event/log publishers
  cron/             Cron expression parser and scheduled job trigger
  executor/         Podman interface: pod create, stop, exec, logs, image pull, stats, liveness probes
  gpu/              GPU detection (nvidia-smi), device enumeration, and system resource detection
  lifecycle/        Graceful shutdown coordinator with pod draining
  manifest/         K8s YAML parser (Pod, Job, CronJob, Deployment, StatefulSet) with ports, init containers, securityContext, livenessProbe
  metrics/          Prometheus metrics collector and text renderer
  reconciler/       Desired-state reconciliation loop, pod recovery, resource sync, liveness probe polling
  scheduler/        Resource-aware scheduling with priority preemption and GPU count-based device slot tracking
  state/            Pod state store (in-memory + SQLite WAL persistence) with source path tracking
  watcher/          Manifest directory poller (SHA-256 change detection)

Build / Test / Lint

# Build
go build ./cmd/spark

# Test
go test ./... -race -timeout 120s

# Lint
go vet ./...
staticcheck ./...

Constraints

Go standard library only, except github.com/nats-io/nats.go and modernc.org/sqlite.
Podman, not Docker.
Standard flag package for CLI flags.
HTTP routing via net/http.ServeMux with Go 1.22+ method-aware patterns.

Architecture Decisions

ADR	Title
001	Go standard library only
002	NATS protocol design
003	Local OCI registry
004	K8s manifest compatibility
005	Priority preemption algorithm
006	Resource-aware scheduling
007	Ubuntu deb packaging
008	SQLite state persistence
009	HTTP API design
010	Prometheus metrics via stdlib
011	HTTP bearer token authentication

License

MIT License. See LICENSE for details.
