Sync the conversational epic: headless serve, persona Slack, honest usage, lifecycle client#215
Sync the conversational epic: headless serve, persona Slack, honest usage, lifecycle client#215prasadus92 wants to merge 12 commits into
Conversation
Remove the server-rendered dashboard from alfred serve: drop the Jinja templates, static assets (logo, bundled fonts, stylesheet), the formatting helpers, and the HTML/planning routes plus the HTML-form token path in views.py. Per ADR 0002 the native client is the only GUI and the CLI covers everything else; mutating POSTs still require the per-launch X-Alfred-Token header. Drop the removed --no-browser flag and the browser auto-open from the serve runtime, and wire the file-issue spec-memory-candidate with an already_filed idempotency guard so duplicate clicks never pollute the review queue. Update test_server.py to the headless surface.
Carry real per-result token counts on ClaudeResult (tokens_in, tokens_out, cache_creation_tokens, cache_read_tokens) parsed from the result envelope usage block. SpendState.record_result persists per-day token counters plus an engine_breakdown rollup while keeping cost_usd_today for back-compat, and metrics.agent_metric sums the new keys. server/usage.py surfaces Codex quota from the rate_limits windows, dedupes subagent token replays via the cumulative session total, and resets delta totals on a day rollover. Add the shared agent profile model (agent_profile / agent_label / AgentProfile / AgentTheme) with public-safe defaults and env overrides, and render a Display . Role label in the server reader and schedule views. Thread claude_permission_mode through claude_invoke, claude_invoke_streaming, and invoke_agent_engine including the auth-retry recursions, so a caller can enforce a read-only Claude plan mode.
Add a converse intent for read-only questions, summaries, and banter. Tier-1 answers from prompt context; the listener escalates once to a richer read-only tier-2 turn with a 60s budget and a bounded turn budget, enforced read-only for real (Claude plan mode, Codex read-only sandbox). Add a status_facet hint, a persona voice layer (butler default, gilfoyle preset) injected outside the untrusted sentinel, a persisted ConversationContext, and the resolve_min_confidence public accessor honored by the listener. Render the Display . Role profile label across Slack posts and control output. Default CLAUDE_CONFIG_DIR in agent-launch so a reboot cannot strand firings on a stale keychain token, and add a doctor sentinel to the memory-harvest wrapper.
Reorganize the native client around the lifecycle: Home, Ask, Pipeline, Fleet, and Lessons. PipelineView replaces the Kanban board with hold/done and a follow-up convert action; LifecycleCard carries the work-card grammar; chips.ts derives the status/repo/age chips; the Ask composer is a unified conversational input (Enter sends, Shift-Enter newline). Add a glass and density pass, fleet-status role labels, and the Codex quota panel. The client reads and mutates entirely through the headless alfred serve API. Drops the removed App.css, KanbanBoard, and PlansView.
Add docs/DESIGN_SPEC.md (canonical native-client product design) and docs/ARCHITECTURE_RFC.md (code-organization RFC: target package shape, config-driven generic runner, typed event envelope and fail-closed gates, refactoring roadmap), plus ADR 0002 under decisions/ for the one-runtime-four-doors surface decision and client design direction. Update README, docs/INDEX, NATIVE_CLIENT, DESKTOP_CLIENT, SERVE, INSTALL_TIERS, and the site concept/reference pages for the headless serve API and the lifecycle client tab set, dropping the stale --no-browser flag and server-rendered dashboard references. Roll the epic into the CHANGELOG [Next] section per Keep a Changelog.
Line-wrapping only, no logic change: match the repo formatter on the ported slack_intent, slack_listener, and their tests so the CI ruff format --check step passes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 00c404b8fc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if legacy_role and not explicit_role_title and not is_known_profile: | ||
| role_title = legacy_role |
There was a problem hiding this comment.
Preserve legacy role overrides for known agents
When a standard codename like lucius or drake has ALFRED_<CODENAME>_ROLE set by the existing launchd agents.conf role column, this branch ignores that value because is_known_profile is true, so Slack/CLI labels now show the built-in role instead of the operator-configured role. Since launchd/render.sh still emits the legacy ALFRED_<CODENAME>_ROLE variable and not ROLE_TITLE, existing fleets lose their configured role labels unless they adopt a new env var manually.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in e91be6c. agent_profile no longer gates the legacy ALFRED__ROLE behind not is_known_profile. The role-title precedence is now ROLE_TITLE > legacy ROLE > built-in default for every codename, so a standard agent like lucius or drake whose role is configured in agents.conf (and rendered to ALFRED_ROLE by launchd/render.sh) keeps that operator-configured label instead of falling back to the built-in role. Added a test that asserts a known profile shows the built-in default with no env, the legacy _ROLE when set, and an explicit ROLE_TITLE over both.
| action="store_true", | ||
| help="do not auto-open a browser tab", | ||
| ) | ||
| p.add_argument( |
There was a problem hiding this comment.
Keep
--no-browser compatible with the CLI wrapper
Removing this parser option breaks the still-supported wrapper path: bin/alfred continues to register alfred serve --no-browser and forwards --no-browser to this script, so that command now exits with argparse error code 2 (unrecognized arguments: --no-browser). Since serve is headless now, this can be a no-op, but it needs to remain accepted or the wrapper/help should be changed in the same commit.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in e91be6c. alfred-serve.py re-registers --no-browser as an accepted no-op (help suppressed, since serve is headless and never opens a browser), so bin/alfred serve --no-browser keeps forwarding the flag without the script exiting on argparse error 2. Added a parser test that asserts --no-browser parses (args.no_browser True), defaults off when omitted, and the other options still parse.
Resolve the three PR 215 review findings: - views.py: add _planning_draft_path, which validates the operator-supplied draft id and confirms the resolved path stays inside planning-drafts before reading. Use it at both read sites so the CodeQL path-expression alert clears and a validation gap cannot escape the directory. - metadata.py: let the legacy ALFRED_<CODENAME>_ROLE env override the built-in default role for known profiles too. render.sh still emits that variable from agents.conf, so configured fleets kept losing their role labels. Precedence is ROLE_TITLE > legacy _ROLE > built-in default. - alfred-serve.py: re-register --no-browser as an accepted no-op so the still supported bin/alfred serve --no-browser wrapper no longer exits with argparse error 2. serve is headless, so the flag does nothing and is hidden from help. Add tests for each: traversal rejection on the draft path, legacy-role precedence on a known profile, and --no-browser parser acceptance.
|
@codex review Addressed all three findings. The planning-draft reads now route through planning_draft_path, which validates the operator-supplied id and confirms the resolved path stays inside planning-drafts before reading, clearing the CodeQL path-expression alert. agent_profile now lets the legacy ALFRED_ROLE env override the built-in default role for known profiles too (precedence ROLE_TITLE > legacy _ROLE > default), so fleets configured via agents.conf keep their role labels. alfred-serve.py re-registers --no-browser as an accepted headless no-op so the bin/alfred serve --no-browser wrapper no longer hits argparse error 2. Tests added for each (traversal rejection, legacy-role precedence, --no-browser acceptance); full Python suite (1540 passing), desktop vitest (190) and tsc all green, scrub-check clean. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e91be6ccd9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| draft_id = _safe_planning_draft_id(plan_id) | ||
| if draft_id is None: | ||
| path = _planning_draft_path(Path(state_root), plan_id) |
There was a problem hiding this comment.
Restore draft_id before returning issue filing results
When a valid planning draft reaches /api/plans/{plan_id}/file-issue, this replacement drops the draft_id local but the success, already-filed, and non-created return dictionaries below still reference draft_id. That raises NameError, so the route returns a 500 for normal filing attempts, potentially after SlackIssueBridge.convert() has already created the GitHub issue and before the draft is marked filed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 6f571f8. _file_planning_draft_issue now recovers the sanitized id with draft_id = path.stem right after _planning_draft_path resolves and validates the path, so the success, already-filed, and non-created return dicts reference a defined local again. That clears the NameError that turned every normal filing into a 500 (and could fire after SlackIssueBridge.convert had already created the issue). It also resolves the ruff F821 undefined-name failures on the same lines. OSS suite green (full tests/ pass under TZ=UTC), ruff and mypy clean.
…nitizer
The path-safety refactor dropped the draft_id local from
_file_planning_draft_issue but left the success, already-filed, and
non-created return dicts referencing it. That raised a NameError on every
normal filing (a 500 from /api/plans/{plan_id}/file-issue, possibly after
SlackIssueBridge.convert had already created the issue) and tripped ruff
F821. Recover the sanitized id from the validated filename via path.stem
right after the path resolves, so the responses carry the id again.
Also normalize the validated id with os.path.basename before joining it
onto the planning-drafts base in _planning_draft_path. For an
already-allowlisted id this is a no-op, but it gives CodeQL a recognized
path-injection barrier on top of the existing allowlist regex and the
resolved-parent containment check, clearing the remaining
py/path-injection alert.
OSS suite green (1000+ tests pass under TZ=UTC), ruff and mypy clean,
cleanliness scan clean.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6f571f85e5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| role = fields[6].strip() if len(fields) >= 7 else "" | ||
| codename = _codename_from_label(label) | ||
| profile_fields = agent_profile(codename).to_dict() | ||
| role_title = profile_fields["role_title"] or role |
There was a problem hiding this comment.
Prefer agents.conf role titles over defaults
When agents.conf has a custom role for a built-in codename, agent_profile(codename) already supplies a non-empty default such as Lucius · Senior Developer, so this expression never falls back to the role parsed from the file. The existing /api/schedule tests still assert the raw role, but the desktop UI now reads role_title, so operators who customized column 7 in agents.conf will see the built-in role instead of their configured one. Use the parsed role before the profile default here.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 97c1c03. _parse_row now sets role_title = role or profile default, so an operator's customized agents.conf column-7 role wins over the built-in profile default (which agent_profile always supplies non-empty for known codenames). Added a role_title assertion to test_api_schedule_returns_cron_and_interval_runs confirming the configured 'Single-repo engineer' surfaces instead of 'Lucius · Senior Developer'.
| state = _agent_state_label(agent) | ||
| fired = _last_fired_label(agent) | ||
| lines.append(f"- `{name}` — {state}{fired}") | ||
| lines.append(f"- {name} — {state}{fired}") |
There was a problem hiding this comment.
Replace em dashes in Slack prose
The root AGENTS.md explicitly says “No em-dashes in prose or comments”; this added Slack status string introduces an em dash separator, and the new recent-runs line below does the same. Please use a comma, colon, or plain hyphen separator instead so the changed operator-facing prose follows the repository rule.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 97c1c03. The two added Slack lines (the agent-list status and the recent-runs line) now use a colon separator instead of an em dash, matching the repo's no-em-dash rule for operator-facing prose. The substring assertions in test_slack_control.py still hold.
… containment - slack_control.py: the two added Slack status lines (agent list and recent runs) used an em dash separator, which the repo rule forbids in prose. Use a colon separator instead. - schedule.py: _parse_row built role_title as profile default OR parsed role, but agent_profile supplies a non-empty default for known codenames, so a custom agents.conf column-7 role never surfaced in the desktop role_title. Prefer the parsed role and fall back to the profile default. Adds a role_title assertion to the /api/schedule test. - views.py: harden _planning_draft_path so CodeQL recognizes the path containment. Resolve the candidate with os.path.realpath and confirm via os.path.commonpath that it sits directly inside the resolved planning-drafts directory, on top of the existing allowlist regex and basename reduction. Existence stays a caller concern, so the 400-vs-404 split is preserved. Ruff and mypy clean, full suite green under TZ=UTC, cleanliness scan clean.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 97c1c032be
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # tools only: a conversational answer never mutates fleet or GitHub state. | ||
| DEFAULT_ESCALATE_TIMEOUT = 60 | ||
| DEFAULT_ESCALATE_TURNS = 3 | ||
| DEFAULT_ESCALATE_TOOLS = "Bash(gh *) Read Grep" |
There was a problem hiding this comment.
Pass Claude tools as a valid allowlist
In the default tier-2 Slack converse path, this value is passed through invoke_agent_engine to claude --allowedTools as one argv element. I checked Anthropic's Claude Code CLI reference, which describes --allowedTools as a list and shows separate entries such as "Bash(git log:*)" "Read" (https://docs.claude.com/en/docs/claude-code/cli-reference); the rest of this repo also uses comma-separated strings like Read,Bash,Grep. With the current space-delimited string, Claude sees one invalid/unknown allowlist entry, so escalation runs without the intended gh, Read, and Grep permissions and status questions that need repo/GitHub context can fail or fall back unnecessarily.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in def618d. DEFAULT_ESCALATE_TOOLS is now comma-separated ("Bash(gh *),Read,Grep"), matching every other claude_allowed_tools caller and what agent_runner forwards to --allowedTools. The space-separated form was passed as a single argv and mis-parsed. Full suite green under TZ=UTC, ruff and mypy clean.
| ) | ||
| self.save() | ||
|
|
||
| def record_result(self, result: Any, *, engine: str | None = None) -> None: |
There was a problem hiding this comment.
Wire token recording into real firings
This helper records the new token counters, but it is not called by any production firing path. I checked the repo-wide call sites, and outside tests record_result has no callers while agents still record engine usage via spend.increment(...), so tokens_in_today, tokens_out_today, cache_tokens_today, and engine_breakdown stay at their default zero values for real runs even when ClaudeResult contains usage. That makes the new spend-ledger token reporting silently inaccurate until the existing result-recording call sites use this method.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Acknowledged. record_result is a backward-compatible superset of the legacy increment(firings_today=...) call, but wiring it into every runner firing path (replacing the direct increment calls across the runners so the new token counters populate in production) is a cross-cutting change to the live spend ledger that is larger than this conversational-epic PR. Flagging it to land as a separate, focused change rather than expanding this PR's scope and risking the spend-recording path. The server-side usage reader already covers per-session token reporting in the meantime.
The realpath+commonpath guard cleared one of the two path-injection alerts, but _planning_draft_path still rejoined the operator filename onto base in its return, so the second read site (the planning-memory proposal) kept the taint and stayed flagged. Return the realpath-resolved, commonpath-verified candidate directly. Behavior is unchanged: a valid id still yields its path whether or not the file exists (callers keep the 404-vs-unsafe-id split), traversal and separator ids still return None, and the stem still recovers the sanitized id.
|
@codex review |
Mirror the internal font directive onto the OSS desktop client. Instrument Sans carries the UI with headings by weight and tracking; Fragment Mono is reserved for transcripts, ids, and code. Space Grotesk, Quicksand, and JetBrains Mono are removed as default-AI-product fonts. Swaps the @fontsource imports and the --font-heading/--font-sans/--font-mono token block, and updates the lockfile. Desktop suite green: vitest 190/190, tsc --noEmit clean, vite build resolves the new font imports. Cleanliness scan clean.
|
Added in bdbb75b: applied the operator font directive to the OSS desktop client, mirroring the internal change. Swapped the @fontsource imports and the --font-heading/--font-sans/--font-mono token block to Instrument Sans Variable and Fragment Mono, removing Space Grotesk, Quicksand, and JetBrains Mono, and updated the lockfile. Desktop suite green (vitest 190/190, tsc clean, vite build resolves the new imports) and the cleanliness scan is clean. @codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bdbb75bfa3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # tools only: a conversational answer never mutates fleet or GitHub state. | ||
| DEFAULT_ESCALATE_TIMEOUT = 60 | ||
| DEFAULT_ESCALATE_TURNS = 3 | ||
| DEFAULT_ESCALATE_TOOLS = "Bash(gh *) Read Grep" |
There was a problem hiding this comment.
Use a valid allowed-tools list for escalation
When Slack converse falls back to the default tier-2 Claude path, this string is forwarded as the single --allowedTools argument. The runner documents and all other callers use comma-separated tool specs such as Read,Grep,Glob, so this space-separated value is treated as one malformed tool entry instead of enabling Read, Grep, and the gh bash pattern. In default hybrid mode that makes the promised read-only tool escalation fail or run without the tools it needs, so questions that require repo or GitHub context often degrade to the empty fallback instead of answering.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in def618d (same root cause as the earlier note on this line). DEFAULT_ESCALATE_TOOLS is now "Bash(gh *),Read,Grep", comma-separated to match the runner's --allowedTools convention.
DEFAULT_ESCALATE_TOOLS was space-separated ("Bash(gh *) Read Grep"), but
it is forwarded as a single --allowedTools argv element, and every other
claude_allowed_tools caller plus agent_runner's own docs use comma-separated
specs. The space form is mis-parsed, so the tier-2 Slack converse fallback
did not get the intended read-only tool gate. Use "Bash(gh *),Read,Grep".
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep them coming! Reviewed commit: ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Brings alfred-os to functional parity with the native conversation control epic, scrubbed for public release. One branch, logically grouped commits (serve, usage, slack, client, docs).
Headless serve
alfred serveis now a headless localhost JSON/SSE API. Removed the server-rendered dashboard (Jinja templates, static assets,formatting.py, the HTML//planningroutes, and the HTML-form token path), dropped the--no-browserflag and the browser auto-open, and wired the file-issue spec-memory-candidate with analready_filedidempotency guard. Mutating POSTs still require the per-launchX-Alfred-Tokenheader.Conversational Slack
Added a
converseintent (read-only questions, summaries, banter) with a tier-2 escalation enforced read-only for real: Claude--permission-mode planand a Codexread-onlysandbox, threaded through both invokers including the auth-retry recursions. Added a persona layer (butler default, gilfoyle preset, injected outside the untrusted sentinel), a persistedConversationContext, astatus_facetsteer, and theresolve_min_confidencepublic accessor honored by the listener.Honest usage
ClaudeResultcarries real token fields from the result envelope;SpendState.record_resultpersists per-day token counters plus anengine_breakdownrollup andmetrics.agent_metricsums them.server/usage.pysurfaces Codex quota from therate_limitswindows, dedupes subagent token replays via the cumulative session total, and resets delta totals on a day rollover. Added the shared agent profile model (agent_profile/agent_label) so every codename renders aDisplay . Rolelabel.Fleet reliability
agent-launchdefaultsCLAUDE_CONFIG_DIRso a reboot cannot strand firings on a stale keychain token. Added a doctor sentinel to the memory-harvest wrapper, and made the preflight Slack throttle tests hermetic by redirectingSTATE_ROOT.Native client
Reorganized the desktop client to the lifecycle IA (Home, Ask, Pipeline, Fleet, Lessons).
PipelineViewreplaces the Kanban board with hold/done and a follow-up convert action;LifecycleCardandchips.tscarry the work-card grammar; the Ask composer is a unified conversational input. Glass and density pass, fleet-status role labels, and the Codex quota panel. The client reads and mutates entirely through the headless API.Docs
Added
docs/DESIGN_SPEC.md,docs/ARCHITECTURE_RFC.md, and ADR 0002 underdecisions/. Updated README,docs/INDEX, NATIVE_CLIENT, DESKTOP_CLIENT, SERVE, INSTALL_TIERS, and the site concept/reference pages for the headless serve API and the lifecycle tab set. Rolled the epic into the CHANGELOG[Next]section per Keep a Changelog (no release tagged).Validation
tsc --noEmitclean, 190 vitest tests passed.bin/scrub-check.sh: clean.