Tags: askalf/dario
Tags
fix(tools): only advertise CC tools the client declared (4.8.75) (#526) * fix(tools): only advertise CC tools the client declared (release 4.8.75) Default (template) mode replaced the client's tool array with the FULL CC_TOOL_DEFINITIONS (31 tools incl. AskUserQuestion/Agent/Task*/Workflow). A CC session with a tool disabled (headless/SDK context where AskUserQuestion isn't enabled) sends its array without it; dario re-added it; the model emitted a tool_use the client harness couldn't run -> 'AskUserQuestion exists but is not enabled in this context'. Default mode now advertises only the CC-native tools the client actually declared, using CC's canonical definitions for them. A real reduced-tool CC client sends exactly this array, so it tracks CC's wire shape rather than diverging. Full-tool clients are unaffected; if the client declares no CC-native tool at all the full template is kept. preserve/merge unchanged. Regression test: 7 cases; full suite 90/90. * test: make tool-advertise test platform-agnostic Original asserted exact counts using Glob/Grep, which are win32-only (PLATFORM_ONLY_TOOLS) — passed on Windows, failed on Linux CI where they're filtered out of CC_TOOL_DEFINITIONS. Switch to cross-platform tools (Read/Bash/Edit/Write/WebSearch) and assert the platform-agnostic SUBSET invariant (no advertised tool was undeclared) as the primary check.
fix(health): don't leak OAuth internals on the public /health (releas… …e 4.8.74) (#525) dario's /health is auth-free (docker healthchecks need it pre-secret). Behind a Cloudflare tunnel with a public /health bypass for uptime monitoring, it was world-readable and returned the oauth status, access-token expiresIn countdown, request volume, and refresh-error detail. Public requests are identified by the CF-edge-stamped cf-ray header and now get only {status: ok|degraded}; internal loopback callers (docker healthcheck, dario doctor, the self-probe — all hit 127.0.0.1:3456 with no CF headers) still get full detail. HTTP 200/503 is unchanged, so external uptime checks keying on the status code are unaffected. Logic extracted to buildHealthResponse with a 19-case unit test.
fix(cc-template): CC-native identity-map overrides TOOL_MAP (release … …4.8.73) (#524) Completes 4.8.72. That release stopped CC's MISSING tools from being round-robined, but core tools like Read were still corrupted: Read IS in TOOL_MAP as the lowercase cross-client alias 'read', whose translateBack emits {path, filePath} and drops file_path. A CC client's 'Read' lowercased to 'read', hit that alias, and every Read still failed validation client-side. Fix: match CC's own tools by EXACT name (CC_NATIVE_NAMES, PascalCase) and identity-map them AHEAD of TOOL_MAP. Exact case is the discriminator, so a genuine non-CC 'read' (lowercase/snake) still routes via TOOL_MAP unchanged. Tests: issue-29 now asserts Read.translateBack PRESERVES file_path + a regression guard that lowercase 'read' still translates. Suites green (issue-29 54, schema-contract 127, template-invariants 228, strict 16, detection 63, hybrid 60, platform 41 — 0 fail).
release: 4.8.71 — temp-disable suspended model families (Fable 5 / My… …thos 5) (#518) Filters Fable 5 / Mythos 5 (globally suspended 2026-06-12) out of the catalog + rejects them with a clean 404. Reversible via DARIO_SUSPENDED_MODELS. Full suite 88/88.
feat(doctor): per-family client-system obedience probe (--obedience) … …+ doctor-watch wiring (v4.8.67) (#511) Closes #509. The 2026-06-12 sonnet regression (client system prompts silently ignored; fixed by #508 precedence framing) was invisible to every existing check - 200s, subscription billing, label-clean templates, model smoke all green - and detection came from a downstream consumer failing to parse JSON. - dario doctor --obedience: probe each model family THROUGH the running proxy (exercises the cc-template merge seam) with a trivial client system prompt ("reply with ONLY the word PONG"), up to 3 attempts per family, families in parallel. fail = answered-but-ignored (behavioral drift; runbook: investigate presentation/merge, not necessarily a dario bug). warn = probe could not complete (infra flake, not drift). - check-doctor-drift.mjs runs doctor --obedience and treats obedience FAIL rows as drift, so the 6-hourly on-box dario-doctor-watch files the dedup'd issue the morning behavior shifts. A deployed dario predating the flag emits no rows - the watch upgrade ships safely ahead of the container image. - e2e battery: 4-family obedience section (live-validated 16/16). - Pure verdict helpers (isObedientReply, extractMessageText) unit-tested in test/doctor-obedience.mjs (28 asserts); suite 88/88. Live-validated on a local proxy with prod OAuth: all 4 families PONG on attempt 1/3; watch parse contract verified against real doctor output.
PreviousNext