Tags: a1exus/sparky

v0.5.0

v0.5.0 — llama-cpp router mode, float-tags policy

Adds:
- llama-cpp router mode (default): multi-model auto-discovery from the
  HF cache via symlink farm + auto-generated config.ini (managed-fields
  semantics). LLAMA_API_KEY bearer auth. Three model IDs per GGUF in
  /v1/models. Classic single-model mode is still supported for one-off
  pinning. New helper scripts: scripts/sync-router.sh and
  scripts/regen-config-ini.py.
- tailscale/ stack: third ingress path alongside LAN (mDNS) and public
  (Cloudflare Tunnel), with per-backend VIP services + tailscale/Makefile.
- Spec + plan docs for the router-mode rollout under docs/superpowers/.
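
The symlink-farm discovery mentioned in the first bullet can be sketched
roughly as follows. This is an illustration only, not the contents of
scripts/sync-router.sh: the function name `build_symlink_farm`, the link
naming scheme, and the farm directory are assumptions. The only thing taken
as given is the standard Hugging Face hub cache layout
(`models--<org>--<name>/snapshots/<revision>/<file>`).

```python
from pathlib import Path


def build_symlink_farm(hub_dir: Path, farm_dir: Path) -> list[Path]:
    """Link every GGUF under hub_dir/models--*/snapshots/ into farm_dir.

    Hypothetical helper: the real sync script may name links differently.
    """
    farm_dir.mkdir(parents=True, exist_ok=True)
    links = []
    # Snapshot entries in the HF cache are themselves symlinks into blobs/.
    for gguf in sorted(hub_dir.glob("models--*/snapshots/*/**/*.gguf")):
        repo = gguf.relative_to(hub_dir).parts[0]  # e.g. "models--org--name"
        link = farm_dir / f"{repo.removeprefix('models--')}--{gguf.name}"
        if link.is_symlink() or link.exists():
            link.unlink()  # refresh stale links so reruns are idempotent
        link.symlink_to(gguf.resolve())
        links.append(link)
    return links
```

If the same file name appears in several snapshots of one repo, the last one
in sorted order wins in this sketch; a real implementation would likely need
an explicit policy for that case.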

Changed:
- Image-pin policy flipped: .env.example now defaults to floating tags
  (latest / v2 / server-cuda); operators pin in host-local .env for
  reproducibility.
- llama-cpp GPU exclusivity reclassified: router + Ollama coexist
  (both lazy); vLLM and classic mode are eagerly exclusive.
- llama-cpp README: API + web UI on one port (8080), router quirks
  documented, "Pinning the image" snippet updated for ghcr.io's OCI
  image-index switch.
- Default CTX_SIZE / VLLM_MAX_LEN bumped 8192 → 32768.

Removed:
- caddy/ stack (Traefik replaces it).

Security:
- LLAMA_API_KEY bearer auth on llama-cpp/.
- Known: CVE-2026-33186 in cloudflared 2026.5.0 is still tracked
  upstream; with floating tags now the default, the next
  `docker compose pull` picks up the rebuild once it ships.
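
A minimal sketch of a client exercising the bearer auth above, assuming the
router listens on port 8080 as stated in the README bullet and that
/v1/models returns the usual OpenAI-style `{"data": [{"id": ...}]}` payload.
The function names and the base URL are illustrative, not part of the repo.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for /v1/models carrying the bearer token."""
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def list_model_ids(base_url: str, api_key: str) -> list[str]:
    """Return the model IDs the server advertises (needs a running server)."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    # Assumed response shape: {"data": [{"id": "..."}, ...]}
    return [model["id"] for model in payload.get("data", [])]
```

Called as `list_model_ids("http://localhost:8080", key)` with the value of
LLAMA_API_KEY, this would be expected to list several IDs per GGUF, per the
router-mode note above; without a valid key the server should reject the
request.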

See CHANGELOG.md for the full per-bullet diff.

v0.4.0

v0.4.0 — Traefik primary, Cloudflare Tunnel, polished tooling

Adds:
- traefik/ stack as the primary HTTPS reverse proxy (docker-label-driven,
  mints its own internal CA, optional LetsEncrypt scaffolding).
- cloudflare/ stack — Cloudflare Tunnel connector for outbound-only
  public ingress, no inbound ports.
- mdns/: ergonomic per-alias targets (`make add/remove/logs/resolve
  ALIAS=<name>`).
- Trivy image scans for the two new images; .github/dependabot.yml for
  weekly grouped GitHub Action SHA bumps.
- sparky.svg mascot — fresh redesign, speech-bubble face on warm cream.

Changes:
- caddy/ is now the backup proxy. Shared front-end Docker network
  renamed caddy → traefik, ownership moved into traefik/.
- vllm/ + llama-cpp/: host-wide config in .env (image pin, HF cache,
  HF token, default knobs); per-variant values in envs/<name>.env.
  `make up ENV=<name>` chains both via --env-file.
- llama-cpp/ pin re-locked to a multi-arch manifest-list digest of
  `server-cuda` (per-build tags are amd64-only, broke arm64).
- traefik/ uses its own internal CA — Caddy is no longer a precondition.
- Assorted polish: hf-cache annotates safetensors-only repos,
  hf-sync explains why they're skipped, top-level README rewritten,
  CI jobs have timeout-minutes, image pins refreshed.
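
The two-layer env chaining described for `make up ENV=<name>` could expand
to something like the command built below. This is a sketch under
assumptions: the actual Makefile recipe is not shown here, and the helper
name `compose_up_command` and the `-d` flag are illustrative. It relies on
Docker Compose accepting repeated `--env-file` flags, with later files
overriding earlier ones.

```python
from pathlib import PurePosixPath


def compose_up_command(stack_dir: str, env_name: str) -> list[str]:
    """Build a `docker compose up` argv layering host-wide and per-variant env files."""
    stack = PurePosixPath(stack_dir)
    return [
        "docker", "compose",
        "--env-file", str(stack / ".env"),                       # host-wide defaults
        "--env-file", str(stack / "envs" / f"{env_name}.env"),   # per-variant overrides
        "up", "-d",
    ]
```

Putting the host-wide file first means a variant file can override any
default knob without editing the shared .env.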

See CHANGELOG.md for the full set.

v0.3.0

v0.3.0 — vLLM stack, tool-calling, env-per-variant workflow

Adds:
- vllm/ stack (vllm-openai v0.20.2, OpenAI tool-calling via qwen3_xml,
  Caddy-fronted at https://vllm.<domain>). Smoke-tested on GB10 with
  Qwen3.6-27B at 64K context.
- One-env-per-model-variant layout for vllm/ and llama-cpp/ —
  `make up ENV=<name>` uses --env-file directly (no rolling .env).
- `make hf-cache` / `make hf-sync` for env reconciliation against the
  host's HF / GGUF caches. .bak-based orphan path is non-destructive.
- caddy/Makefile with `make ca-cert` to extract the internal root.

Changes:
- Deploy workflow: /opt is a git checkout on the host; `git pull`
  replaces the old scp + sudo install pattern.
- caddy/ owns the shared `caddy` Docker network (defines it,
  attachable). Other stacks join as external.
- open-webui Caddy vhost moved to open-webui.<domain> (matches the
  per-service subdomain convention).
- open-webui persistent volumes (open-webui, open-webui-ollama) are
  external — never destroyed by `docker compose down -v`.

See CHANGELOG.md for the full set.

v0.2.0

v0.2.0 — llama.cpp, Ollama API exposure, mDNS subdomains, docs

Highlights:
- New llama-cpp/ stack with GPU on GB10; reuses Ollama + HF caches.
- Ollama API now reachable at ollama.${CADDY_DOMAIN}.
- mDNS subdomain aliases via mdns/Makefile.
- Caddyfile split into per-service files under Caddyfile.d/.
- Netdata basic-auth removed (LAN-trust posture).
- Per-stack READMEs; dedicated Trivy workflow doc.

See CHANGELOG.md for the full list.

v0.1.0

v0.1.0 — initial sparky release

First release: open-webui + ollama, caddy reverse proxy with
internal CA, netdata observability, mdns subdomain aliases,
Trivy CI security scanning.

See CHANGELOG.md for full details.