Tags: a1exus/sparky
v0.5.0 — llama-cpp router mode, float-tags policy
Adds:
- llama-cpp router mode (default): multi-model auto-discovery from the HF cache via symlink farm + auto-generated config.ini (managed-fields semantics). LLAMA_API_KEY bearer auth. Three model IDs per GGUF in /v1/models. Classic single-model mode still supported for one-off pinning. New helpers scripts/sync-router.sh + scripts/regen-config-ini.py.
- tailscale/ stack: third ingress path alongside LAN (mDNS) and public (Cloudflare Tunnel), with per-backend VIP services + tailscale/Makefile.
- Spec + plan docs for the router-mode rollout under docs/superpowers/.
Changed:
- Image-pin policy flipped: .env.example defaults float (latest / v2 / server-cuda); operators pin in host-local .env for reproducibility.
- llama-cpp GPU exclusivity reclassified: router + Ollama coexist (both lazy); vLLM and classic mode are eagerly exclusive.
- llama-cpp README: API + web UI on one port (8080), router quirks documented, "Pinning the image" snippet updated for ghcr.io's OCI image-index switch.
- Default CTX_SIZE / VLLM_MAX_LEN bumped 8192 → 32768.
Removed:
- caddy/ stack (Traefik replaces it).
Security:
- LLAMA_API_KEY bearer auth on llama-cpp/.
- Known: CVE-2026-33186 in cloudflared 2026.5.0 still upstream-tracked; with floating tags now the default, next `docker compose pull` picks up the rebuild once shipped.
See CHANGELOG.md for the full per-bullet diff.
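As a rough illustration of the LLAMA_API_KEY bearer auth noted above, the sketch below builds an authenticated request for /v1/models. It is not taken from the repository: the function names are hypothetical, the `localhost` host is an assumption (only port 8080 comes from these notes), and the response is assumed to follow the OpenAI-style `{"data": [{"id": ...}]}` list shape.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list.

    Hypothetical helper: base_url and api_key are supplied by the caller.
    """
    url = base_url.rstrip("/") + "/v1/models"
    # Bearer auth: the key travels in the Authorization header.
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )


def list_model_ids(base_url: str, api_key: str, timeout: float = 10.0) -> list[str]:
    """Send the request and return the model IDs.

    Assumes an OpenAI-style body: {"data": [{"id": "..."}, ...]}.
    """
    req = build_models_request(base_url, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return [entry["id"] for entry in body.get("data", []) if "id" in entry]
```

Against a running router, something like `list_model_ids("http://localhost:8080", os.environ["LLAMA_API_KEY"])` would be expected to return several IDs per GGUF, per the note above; a request without the header should be rejected if the key is enforced.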
v0.4.0 — Traefik primary, Cloudflare Tunnel, polished tooling
Adds:
- traefik/ stack as the primary HTTPS reverse proxy (docker-label-driven, mints its own internal CA, optional LetsEncrypt scaffolding).
- cloudflare/ stack: Cloudflare Tunnel connector for outbound-only public ingress, no inbound ports.
- mdns/ ergonomic per-alias targets (`make add/remove/logs/resolve ALIAS=<name>`).
- Trivy image scans for the two new images; .github/dependabot.yml for weekly grouped GitHub Action SHA bumps.
- sparky.svg mascot: fresh redesign, speech-bubble face on warm cream.
Changes:
- caddy/ is now the backup proxy. Shared front-end Docker network renamed caddy → traefik, ownership moved into traefik/.
- vllm/ + llama-cpp/: host-wide config in .env (image pin, HF cache, HF token, default knobs); per-variant values in envs/<name>.env. `make up ENV=<name>` chains both via --env-file.
- llama-cpp/ pin re-locked to a multi-arch manifest-list digest of `server-cuda` (per-build tags are amd64-only, broke arm64).
- traefik/ uses its own internal CA; Caddy is no longer a precondition.
- Many polished bits: hf-cache annotates safetensors-only repos, hf-sync explains why they're skipped, top-level README rewritten, CI jobs have timeout-minutes, image pins refreshed.
See CHANGELOG.md for the full set.
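The env-file chaining described above (host-wide .env plus envs/<name>.env) can be pictured as the command line that `make up ENV=<name>` might assemble. This is only a sketch under assumptions: the function name is hypothetical, the trailing `up -d` arguments are a guess, and the actual Makefile recipe may differ. It relies on Docker Compose accepting repeated `--env-file` flags, with later files overriding earlier ones.

```python
from pathlib import PurePosixPath


def compose_up_command(env_name: str, env_dir: str = "envs") -> list[str]:
    """Return a docker compose invocation that layers two env files.

    Hypothetical helper; "up -d" is an assumed set of arguments.
    """
    if not env_name or "/" in env_name:
        # Keep the variant name a plain file stem inside env_dir.
        raise ValueError(f"invalid env name: {env_name!r}")
    variant = PurePosixPath(env_dir) / f"{env_name}.env"
    return [
        "docker", "compose",
        "--env-file", ".env",          # host-wide config (image pin, HF cache, ...)
        "--env-file", str(variant),    # per-variant values, listed last to override
        "up", "-d",
    ]
```

Listing the per-variant file last is the design choice that matters here: a value set in envs/<name>.env would win over the same key in the host-wide .env.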
v0.3.0 — vLLM stack, tool-calling, env-per-variant workflow
Adds:
- vllm/ stack (vllm-openai v0.20.2, OpenAI tool-calling via qwen3_xml, Caddy-fronted at https://vllm.<domain>). Smoke-tested on GB10 with Qwen3.6-27B at 64K context.
- One-env-per-model-variant layout for vllm/ and llama-cpp/: `make up ENV=<name>` uses --env-file directly (no rolling .env).
- `make hf-cache` / `make hf-sync` for env reconciliation against the host's HF / GGUF caches. .bak-based orphan path is non-destructive.
- caddy/Makefile with `make ca-cert` to extract the internal root.
Changes:
- Deploy workflow: /opt is a git checkout on the host; `git pull` replaces the old scp + sudo install pattern.
- caddy/ owns the shared `caddy` Docker network (defines it, attachable). Other stacks join as external.
- open-webui Caddy vhost moved to open-webui.<domain> (matches the per-service subdomain convention).
- open-webui persistent volumes (open-webui, open-webui-ollama) are external; never destroyed by `docker compose down -v`.
See CHANGELOG.md for the full set.
v0.2.0 — llama.cpp, Ollama API exposure, mDNS subdomains, docs
Highlights:
- New llama-cpp/ stack with GPU on GB10; reuses Ollama + HF caches.
- Ollama API now reachable at ollama.${CADDY_DOMAIN}.
- mDNS subdomain aliases via mdns/Makefile.
- Caddyfile split into per-service files under Caddyfile.d/.
- Netdata basic-auth removed (LAN-trust posture).
- Per-stack READMEs; dedicated Trivy workflow doc.
See CHANGELOG.md for the full list.