feat: in-sandbox boost channel (HTTP-over-Unix-socket) #15

Merged
erans merged 17 commits into main from feat/in-sandbox-boost-channel on Apr 27, 2026
erans (Owner) commented Apr 27, 2026

Summary

  • Adds a per-sandbox HTTP-over-Unix-socket channel inside running sandboxes so guest code can request boosts (and read self-state) without going through the operator-facing daemon API.
  • Each sandbox gets a dedicated host-side UDS exposed inside the guest at `/var/run/navaris-guest.sock`. Firecracker uses vsock + a guest-side proxy in `navaris-agent`; Incus uses an Incus-managed bind-mount via the `unix-socket` device type. Both produce the same property: each accepted connection is unambiguously the boost channel for one sandbox.
  • A shared `BoostHTTPHandler` routes `POST/GET/DELETE /boost` and `GET /sandbox` to the existing `BoostService` (spec #2). Guest requests are tagged `source: "in_sandbox"` on the event payloads.
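
A guest-side caller could reach the channel with a plain `net/http` client whose transport dials the Unix socket. The following is a minimal sketch, not code from this PR: the helper names `newUnixSocketClient` and `requestBoost` and the placeholder host `guest` are assumptions for illustration (the host part of the URL is ignored because every connection goes to the socket).

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
)

// newUnixSocketClient returns an http.Client whose connections always dial
// socketPath, regardless of the host named in the request URL.
// (Hypothetical helper, not part of the navaris codebase.)
func newUnixSocketClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
			// The handler described in this PR serves one request per
			// connection, so connection reuse is pointless.
			DisableKeepAlives: true,
		},
	}
}

// requestBoost POSTs a JSON body to /boost and returns the status code and
// response body. The "guest" host is a placeholder.
func requestBoost(c *http.Client, jsonBody []byte) (int, string, error) {
	resp, err := c.Post("http://guest/boost", "application/json", bytes.NewReader(jsonBody))
	if err != nil {
		return 0, "", fmt.Errorf("post /boost: %w", err)
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", fmt.Errorf("read response: %w", err)
	}
	return resp.StatusCode, string(b), nil
}
```

Inside a sandbox this would be called as `requestBoost(newUnixSocketClient("/var/run/navaris-guest.sock"), []byte(`{"memory_limit_mb":384,"duration_seconds":30}`))`, mirroring the `curl --unix-socket` smoke test in the test plan.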

Implementation per spec and plan. 15 commits, one per task.

Notable decisions

  • Implicit auth via channel binding: each accepted UDS connection is bound to a single sandbox identity by the transport (FC vsock listener / Incus per-sandbox UDS). No tokens.
  • Per-sandbox token-bucket rate limiting: 1 rps, burst 10 (`internal/api/ratelimit.go`). Flat per-conn for v1 — every accepted connection consumes one token regardless of method.
  • Source field on `EventBoostStarted`: spec §3.10 only requires it there, but we also emit `"source": "external"` on `EventBoostExpired`/`EventBoostRevertFailed` so consumers can rely on the field always being present (deliberate extension).
  • Restart recovery: FC walks in-memory vminfo; Incus walks the SandboxStore (state lives in incusd, not in our process). Stale Incus `navaris-boost` devices are removed before re-adding.
  • Provider/api decoupling: introduced the exported `provider.BoostServer` interface so providers don't import `internal/api`. `*api.BoostHTTPHandler` satisfies it implicitly through Go's structural interface typing.
  • `SandboxID` on `CreateSandboxRequest`: threaded through the service layer so providers know the navaris-side identity (vs. `BackendRef` which is the FC vmID / Incus container name).

Daemon flags

  • `--boost-channel-enabled` (default `true`) — daemon-default for new sandboxes
  • `--boost-channel-dir` (default `/var/lib/navaris/boost-channels`) — host directory for per-sandbox Incus UDS files
  • Per-sandbox opt-in/out via `enable_boost_channel` on create-sandbox requests
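
The interaction between the daemon default and the per-sandbox `enable_boost_channel` field can be pictured as a tri-state option. This is a small sketch assuming the request field maps to a `*bool` where nil means "use the daemon default", as the commit log describes; the function name `resolveBoostChannel` is hypothetical.

```go
package main

// resolveBoostChannel returns the effective setting for a new sandbox.
// A nil requested value means the create request did not specify
// enable_boost_channel, so the daemon default applies.
// (Hypothetical helper name.)
func resolveBoostChannel(requested *bool, daemonDefault bool) bool {
	if requested != nil {
		return *requested
	}
	return daemonDefault
}
```

Using a pointer rather than a plain `bool` is what lets an explicit `false` in the request override a daemon default of `true`.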

Test plan

  • `go test ./...` (20 packages, all green)
  • `go test -tags incus ./...` (green)
  • `go test -tags firecracker ./...` (green)
  • `go test -tags 'incus firecracker' ./...` (green)
  • `go test -c -tags integration ./test/integration/` (compile-only, clean)
  • `go vet -tags integration ./test/integration/` (clean)
  • `npm run build` in `web/` (clean)
  • `/tmp/navarisd --help | grep boost-channel` shows both flags
  • FC integration smoke (requires real KVM env): `navaris sandbox exec ... -- curl --unix-socket /var/run/navaris-guest.sock -X POST http://_/boost -d '{"memory_limit_mb":384,"duration_seconds":30}'`

🤖 Generated with Claude Code

erans and others added 17 commits April 26, 2026 19:14
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…resolution

Add EnableBoostChannel *bool to CreateSandboxOpts (nil = use daemon default)
and defaultBoostChannel bool to SandboxService / NewSandboxService. Both Create
and CreateFromSnapshot resolve the option at create time and persist it onto
domain.Sandbox; handleCreate threads EnableBoostChannel and SandboxID through to
domain.CreateSandboxRequest so the provider can bind the boost socket. All
NewSandboxService call sites updated to pass false for now; Task 11 will swap
cmd/navarisd/main.go to cfg.boostChannelEnabled.

Default Source to "external" when empty; propagate to EventBoostStarted,
EventBoostExpired, and EventBoostRevertFailed payloads. Pass Source:
"external" explicitly from the operator HTTP handler. Tests cover both
explicit in_sandbox and default-external paths.

Implements BoostHTTPHandler which serves a minimal HTTP/1.1 API over
per-sandbox connections (net.Conn). Reads one request via http.ReadRequest,
dispatches to BoostService (POST/GET/DELETE /boost) or SandboxStore
(GET /sandbox), writes one response, closes the conn. Rate-limits per sandbox
via the token-bucket RateLimiter from Task 6 with flat per-conn accounting.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add a per-VM unix-socket listener that accepts guest-initiated vsock
connections on <vmDir>/vsock_1025 (or <vmDir>/root/vsock_1025 with
jailer) and dispatches each connection to the BoostHTTPHandler via the
boostServer interface. The interface decouples the firecracker package
from internal/api to avoid cyclic imports.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er wiring

Add --boost-channel-enabled (default true) and --boost-channel-dir flags to
navarisd; swap the hardcoded false in NewSandboxService to cfg.boostChannelEnabled;
pass cfg.boostChannelDir into incus.Config.BoostChannelDir.

Introduce provider.BoostServer exported interface in internal/provider/boost.go so
both the firecracker and incus providers can expose SetBoostHandler(provider.BoostServer)
without importing internal/api (which would violate layering). After boostSvc and
BoostHTTPHandler are constructed, main.go walks the builtProviders slice and calls
SetBoostHandler via a local boostHandlerSetter interface assertion.
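
The wiring pattern in this commit message can be sketched as below. The method set of `BoostServer` is an assumption (the real `provider.BoostServer` signature is not shown in this PR description), and `fakeHandler`, `fcProvider`, `plainProvider`, and `wireBoostHandler` are hypothetical stand-ins.

```go
package main

import "net"

// BoostServer stands in for provider.BoostServer. The ServeConn signature
// is assumed for illustration only.
type BoostServer interface {
	ServeConn(conn net.Conn, sandboxID string)
}

// boostHandlerSetter mirrors the local interface that main.go asserts
// against; only providers with a boost channel implement it.
type boostHandlerSetter interface {
	SetBoostHandler(BoostServer)
}

// fakeHandler plays the role of *api.BoostHTTPHandler. It satisfies
// BoostServer implicitly, with no reference to the provider package.
type fakeHandler struct{ served []string }

func (h *fakeHandler) ServeConn(conn net.Conn, sandboxID string) {
	h.served = append(h.served, sandboxID)
	if conn != nil {
		conn.Close()
	}
}

// fcProvider is a provider that supports the boost channel.
type fcProvider struct{ boost BoostServer }

func (p *fcProvider) SetBoostHandler(b BoostServer) { p.boost = b }

// plainProvider has no boost support and is skipped by the wiring loop.
type plainProvider struct{}

// wireBoostHandler hands the handler to every provider that implements
// SetBoostHandler and reports how many were wired.
func wireBoostHandler(providers []any, h BoostServer) int {
	wired := 0
	for _, p := range providers {
		if s, ok := p.(boostHandlerSetter); ok {
			s.SetBoostHandler(h)
			wired++
		}
	}
	return wired
}
```

Because the dependency points from the handler's package toward the small interface, providers never need to import the package that defines the concrete handler.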

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add RestartBoostListeners (FC) and RestartBoostChannel (Incus) to replay
boost listeners for surviving VMs/containers after a daemon restart.
Wire both replay calls in main.go after SetBoostHandler is wired, fixing
the silent no-op that would occur if called inside recover().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…k 502 not file presence

The agent's RunBoostProxy always creates /var/run/navaris-guest.sock inside
the guest regardless of host-side opt-out, so the file-existence assertion
was wrong. With EnableBoostChannel=false the host-side vsock_1025 listener
isn't created, the proxy's vsock.Dial fails, and the proxy returns 502 —
that's the right behavioral indicator.

Previous commit edited the wrong Dockerfile: Dockerfile.navarisd is for
Incus paths; FC compose uses Dockerfile.navarisd-firecracker. Adding curl
to both the alpine and debian rootfs builds so the boost-channel tests
can POST via curl --unix-socket from inside the guest.
erans merged commit dbb9f27 into main on Apr 27, 2026
9 checks passed
erans deleted the feat/in-sandbox-boost-channel branch on Apr 27, 2026, 05:15
erans added a commit that referenced this pull request on Apr 27, 2026
Implementation merged in #15. The spec was committed locally before the
worktree was created (commit 8c3fe98), but never pushed; the plan was
authored as an untracked file. GitHub's squash-merge treated 8c3fe98 as
part of the merge base and excluded it from the squash, so neither doc
landed on main. Adding both retroactively to keep the docs/specs and
docs/plans directories complete for future reference.