Skip to content

fix(hub,cli): four hub-restart-cascade cleanup bugs (#913 #914 #916 #919)#923

Open
heavygee wants to merge 5 commits into
tiann:mainfrom
heavygee:fix/hub-restart-cleanup-bundle
Open

fix(hub,cli): four hub-restart-cascade cleanup bugs (#913 #914 #916 #919)#923
heavygee wants to merge 5 commits into
tiann:mainfrom
heavygee:fix/hub-restart-cleanup-bundle

Conversation

@heavygee

Copy link
Copy Markdown
Collaborator

Summary

Four contained bugs uncovered by the 2026-06-15 hub-restart cascade incident
where systemctl restart hapi-hub.service SIGTERMed 23 cursor ACP sessions.
Each fix is small and lands independently of the architectural #915
(cascade-archive root cause) and the hypothesis-pending #917. They produce
audit-trail correctness and idempotency wins that stand on their own.

Refs #913, #914, #916, #919.

#913 - cursor ACP fresh-session cursorSessionId persistence race

Fresh ACP sessions could be SIGTERMed mid-flight in the async
update-metadata ACK round-trip, stranding the on-disk ACP store with no DB
handle. Distinct from #834 (which fixed the resume path via
pre-registration before session/load).

Fix: add ApiSessionClient.flushMetadata() (drains the metadata lock,
5s timeout) and await it after onSessionFoundWithProtocol on the
fresh-session branch in cursorAcpRemoteLauncher.ts. The resume-path
pre-registration shape stays intact. A 5s no-ACK timeout logs a warning so
the operator has a diagnostic if hub backpressure ever drops the write.

#914 - default archiveReason polluted audit trail with "User terminated"

runnerLifecycle.ts defaulted archiveReason = 'User terminated'. SIGTERM
from a hub-restart cascade hit the same cleanup path that the web-UI
Archive click did, both recording "User terminated" — but only one was
intentional.

Fix:

  • Default in runnerLifecycle.ts flipped to 'Hub restart'.
  • registerKillSessionHandler (the hub-driven kill-RPC, sent only on
    web-UI Archive) now calls setArchiveReason('User terminated') before
    cleanupAndExit.
  • SIGINT (local-terminal Ctrl-C) keeps 'User terminated'.
  • All call sites (runClaude, runCodex, runCursor, runGemini,
    runKimi, runOpencode) updated to pass the lifecycle object instead of
    the bare closure. Legacy closure shape still accepted for
    back-compat (runAgentSession.ts).

#916 - POST /api/sessions/:id/archive returned 500 when CLI is gone

Archive route called engine.archiveSessionrpcGateway.killSession
sessionRpc, which threw a generic Error when no target socket was
registered. The unhandled throw became 500. Operator clicking Archive on
split-brain zombies (after a cascade) saw nothing but a toast.

Fix:

  • New typed RpcTargetMissingError (code: handler-not-registered or
    socket-disconnected) thrown from rpcGateway.rpcCall.
  • syncEngine.archiveSession narrows on it: still flips lifecycleState
    to archived via a new sessionCache.markSessionArchivedFromHub helper
    (writes archivedBy='hub', archiveReason='Archived from hub (CLI unreachable)'),
    then handleSessionEnd marks inactive in cache.
  • Route drops requireActive: true and short-circuits 2xx with
    { ok: true, alreadyArchived: true } for already-archived rows.
  • Non-RPC errors (DB write failure, etc.) still propagate as 5xx.

#919 - sessionCache: three metadata writers threw on version-mismatch without refresh

renameSession, clearSessionArchiveMetadata, and
restoreSessionArchiveMetadata followed a one-shot-throw pattern on
version-mismatch. The cache stayed stale, so the caller's retry hit the
same mismatch → forever-409 on PATCH /sessions/:id,
POST /sessions/:id/reopen, and the reopen rollback path. The codebase
already had the good pattern: sessionCache.mergeSessions retries with a
fresh snapshot.

Fix: rewrite all three helpers as for (attempt = 0; attempt < 5; …)
retry-with-refresh loops. After 5 attempts of genuine concurrent contention
they still throw the original 409 message, so the existing operator-facing
contract is preserved.

§6 gates

  • bun typecheck — passes on the branch against upstream/main@93d00414.
  • bun run test — 982/983 tests pass. The single failure is the
    pre-existing runner.integration.test.ts > should detect version mismatch and kill old runner flake, confirmed to fail on upstream/main before the
    branch too (expected '0.0.0-integration-test-...' to be '0.20.2'
    version-substitution issue unrelated to this diff).
  • New unit tests:
  • Operator-side soup verification: branch merged into local
    driver/integration; 529/529 hub tests + 409/409 relevant cli tests
    pass on the integrated soup build.

Coordination notes

AI disclosure

Per CONTRIBUTING.md: implementation by Claude Sonnet 4.5 (Cursor IDE
agent, auto-model) acting as a feature peer in a multi-agent HAPI fork
workflow. Issue triage and codepath verification were done by a sibling
discovery agent (c8ac5e86). Operator supervised; tests + cold review +
soup verification gates passed before raising the PR. The code follows the
operator's friction-mode disposition: smallest defensible change per bug.

Made with Cursor

 tiann#916 tiann#919)

These four contained bugs were uncovered by a 2026-06-15 hub-restart
incident where `hapi-restart-hub` SIGTERMed 23 cursor ACP sessions.
Each fix lands independently of the architectural tiann#915 (hub-restart
cascade-archive) and the hypothesis-pending tiann#917 (reopen creates dead
session); audit-trail correctness and idempotency wins stand on their
own.

tiann#913 - cursor ACP fresh-session: cursorSessionId persistence race.
  Fresh ACP sessions could be SIGTERMed during the async `update-metadata`
  ACK round-trip, stranding the on-disk ACP store with no DB handle. Add
  `ApiSessionClient.flushMetadata()` and await it after `onSessionFoundWithProtocol`
  on the fresh-session branch. Resume-path pre-registration (PR tiann#834) is
  unchanged.

tiann#914 - default archiveReason 'User terminated' polluted audit trail.
  Hub-restart-cascade SIGTERMs went through the same path as web-UI
  Archive clicks, both writing archiveReason='User terminated'. New
  default is 'Hub restart'; the KillSession RPC handler (the
  authoritative user-archive signal) now explicitly stamps
  'User terminated' before cleanupAndExit. SIGINT (local-terminal Ctrl-C)
  keeps the 'User terminated' label too.

tiann#916 - POST /api/sessions/:id/archive returned 500 when CLI is gone.
  `rpcGateway.killSession` threw a generic Error when no target socket
  was registered, and the archive route surfaced that as 500. Add typed
  `RpcTargetMissingError`, narrow on it in `syncEngine.archiveSession`,
  fall back to a hub-side `markSessionArchivedFromHub` write so
  lifecycleState still flips to 'archived'. Drop the requireActive
  guard on the route and 2xx-noop for already-archived rows.

tiann#919 - sessionCache: three metadata writers threw on version-mismatch
  without refresh, producing forever-409 on rename/reopen until an
  unrelated event triggered a cache refresh. `renameSession`,
  `clearSessionArchiveMetadata`, `restoreSessionArchiveMetadata` now
  retry-with-refresh (5 attempts, then throw) mirroring the existing
  good pattern in `mergeSessions`.

Refs tiann#913
Refs tiann#914
Refs tiann#916
Refs tiann#919

AI disclosure: implementation by Claude Sonnet 4.5 (Cursor agent peer)
under operator supervision. Issue triage by a sibling discovery agent.
Per CONTRIBUTING.md AI-assisted contributions policy.

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Runner-driven stop paths now get mislabeled as hub restarts — the new default makes every SIGTERM archive as Hub restart unless a caller stamps a different reason first. Web Archive now does that through registerKillSessionHandler, but hapi runner stop-session, runner webhook timeout cleanup, and orphan cleanup terminate child sessions with SIGTERM directly (cli/src/runner/run.ts:267, cli/src/runner/run.ts:587). Those are operator/runner actions, not hub restarts, yet the changed SIGTERM handler keeps the new default at cli/src/agent/runnerLifecycle.ts:122, so archived metadata becomes misleading and the audit-trail fix regresses another supported termination path.
    Suggested fix:
    // propagate an explicit reason from runner stop/cleanup paths, e.g. via env on spawn
    const archiveReason = process.env.HAPI_ARCHIVE_REASON
    if (archiveReason) {
        lifecycle.setArchiveReason(archiveReason)
    }

Summary

  • Review mode: initial
  • Found one audit-trail regression introduced by the SIGTERM default change. Remaining risk: I did not run the full test suite in this automation pass.

Testing

  • Not run (automation)

HAPI Bot

// production. If a future code path needs to distinguish "operator
// killed the host process" from "hub restart", it can call
// setArchiveReason() before the runner exits.
process.on('SIGTERM', () => {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Major] Runner-driven stop paths now get mislabeled as hub restarts. The new default makes every SIGTERM archive as Hub restart unless a caller stamps a different reason first. Web Archive now does that through registerKillSessionHandler, but hapi runner stop-session, runner webhook timeout cleanup, and orphan cleanup terminate child sessions with SIGTERM directly (cli/src/runner/run.ts:267, cli/src/runner/run.ts:587). Those are operator/runner actions, not hub restarts, yet this SIGTERM handler keeps the new default, so archived metadata becomes misleading and the audit-trail fix regresses another supported termination path.

Suggested fix:

// propagate an explicit reason from runner stop/cleanup paths, e.g. via env on spawn
const archiveReason = process.env.HAPI_ARCHIVE_REASON
if (archiveReason) {
    lifecycle.setArchiveReason(archiveReason)
}

…archive reason

Addresses bot review of tiann#923: with the tiann#914 default-archiveReason flip to
'Hub restart', runner-driven SIGTERM paths (`hapi runner stop-session`,
webhook-timeout cleanup at run.ts:587, orphan-cleanup at run.ts:267) all
mislabel as 'Hub restart' which is also inaccurate audit-trail noise.

Smallest defensible change: parameterise the lifecycle default via
HAPI_DEFAULT_ARCHIVE_REASON env, and have the runner set
'Stopped by runner' on spawn. Terminal-launched sessions (no runner
parent, no env var) still default to 'Hub restart' since hub-restart
cascade documented at tiann#915 is the most plausible SIGTERM source for
those. Explicit overrides via setArchiveReason (KillSession RPC, SIGINT
Ctrl-C, markCrash uncaught exception) still win.

Two new unit tests cover the env-var default and the override
precedence.

Refs tiann#914.

Co-authored-by: Cursor <cursoragent@cursor.com>
@heavygee

Copy link
Copy Markdown
Collaborator Author

Bot review (Codex PR Review run 27567766623) caught a real Major: with the #914 default flipped to 'Hub restart', three runner-driven SIGTERM paths get mislabeled.

Verified active paths (hub never calls RPC_METHODS.StopSession, that surface is dead):

  • hapi runner stop-session <id> (controlServer.ts -> run.ts stopSession -> killProcessByChildProcess)
  • webhook-timeout cleanup at cli/src/runner/run.ts:587
  • orphan-cleanup at cli/src/runner/run.ts:267

Pushed 1c8972a3 with the smallest defensible fix matching the bot's suggested shape: parameterise the lifecycle default via HAPI_DEFAULT_ARCHIVE_REASON env, runner sets 'Stopped by runner' on spawn. Terminal-launched sessions (no runner parent) keep 'Hub restart' as the default since #915 cascade is the most plausible SIGTERM source there. Explicit overrides (KillSession RPC, SIGINT Ctrl-C, markCrash) still win. Two new unit tests cover env-default and override precedence.

Out of scope for this PR (flagging for maintainer): hapi runner stop-session is currently a bare SIGTERM with no pre-RPC, so it lands as 'Stopped by runner' instead of 'User terminated'. Sending a kill-RPC before SIGTERM would be a follow-up; not changing it here to keep the diff small.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Missing-CLI archive can report success even when the hub archive metadata write fails — markSessionArchivedFromHub returns silently on updateSessionMetadata errors or exhausted version retries, and archiveSession then still marks the in-memory session ended and lets /archive return { ok: true }. That regresses the stated #916 contract that non-RPC failures still surface as 5xx; after a DB/write failure the operator sees success while the persisted row can remain unarchived/running. Evidence hub/src/sync/sessionCache.ts:532, caller continues at hub/src/sync/syncEngine.ts:445 and hub/src/sync/syncEngine.ts:450.
    Suggested fix:
    markSessionArchivedFromHub(sessionId: string, reason: string): void {
        // ...
        if (result.result === 'error') {
            throw new Error('Failed to archive session metadata from hub')
        }
        // ...
        throw new Error('Session was modified concurrently while archiving from hub')
    }

Summary

  • Review mode: follow-up after new commits
  • Found one archive idempotency/data-persistence issue. Remaining risk: I did not run the full test suite in this automation pass.

Testing

  • Not run (automation)

HAPI Bot


if (result.result === 'error') {
throw new Error('Failed to update session metadata')
if (result.result === 'error') {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Major] Missing-CLI archive can report success even when the hub archive metadata write fails. This branch catches RpcTargetMissingError, calls markSessionArchivedFromHub, then always continues to handleSessionEnd, but this helper returns silently when updateSessionMetadata returns error and also returns silently after exhausting version retries. The route will still return { ok: true }, leaving the persisted lifecycle row unarchived while the operator sees success; that also contradicts the PR's intended behavior that non-RPC errors (DB write failure, etc.) still propagate as 5xx.

Suggested fix:

markSessionArchivedFromHub(sessionId: string, reason: string): void {
    // ...
    if (result.result === 'error') {
        throw new Error('Failed to archive session metadata from hub')
    }
    // ...
    throw new Error('Session was modified concurrently while archiving from hub')
}

… 5xx

Addresses second-round bot review of tiann#923 (Major): `markSessionArchivedFromHub`
silently returned on DB write errors and on exhausted version-retry
attempts, which would let `/archive` claim 200 OK while the row stayed
unarchived. That regresses the tiann#916 acceptance criterion that non-RPC
errors during archive must still propagate as 5xx.

Both fall-through paths now throw, matching the contract of the
sibling writers in this file (renameSession, mergeSessions). The
sessionModel test suite gains two cases that spy on
`store.sessions.updateSessionMetadata` to force `error` and
`version-mismatch` shapes and asserts the helper throws. The existing
route test at `hub/src/web/routes/sessions.test.ts:1015` already
covers the route-level 500 propagation for any error thrown out of
`archiveSession`, so no new route test is needed.

Imports `spyOn` from `bun:test` to match this test file's runtime
(the rest of the hub package uses bun:test, not vitest).

Refs tiann#916.

Co-authored-by: Cursor <cursoragent@cursor.com>
@heavygee

Copy link
Copy Markdown
Collaborator Author

Bot follow-up review on 1c8972a3 caught a second Major: markSessionArchivedFromHub silently returned on updateSessionMetadata errors and on exhausted version-retry attempts, which would let /archive claim 200 OK while the row stayed unarchived. That regresses the #916 contract that non-RPC failures still surface as 5xx.

Pushed 123c625e:

  • Both fall-through paths in markSessionArchivedFromHub now throw, matching the contract of the sibling writers (renameSession, mergeSessions).
  • Two new unit tests in sessionModel.test.ts spy on store.sessions.updateSessionMetadata to force result: 'error' and persistent result: 'version-mismatch' and assert the helper throws.
  • No new route test needed: the existing still surfaces a 5xx for non-RPC errors (e.g. DB write failure) test at sessions.test.ts:1015 already covers route-level 500 propagation for any error from archiveSession.
  • bun typecheck clean, 984/985 tests pass (same pre-existing runner.integration.test.ts version-mismatch flake that fails on upstream/main too).

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Runner-spawned sessions are mislabeled on hub restart — cli/src/runner/run.ts:468 now injects HAPI_DEFAULT_ARCHIVE_REASON='Stopped by runner' into every runner-spawned child. Because runnerLifecycle uses that env var as the child’s default archive reason, a systemd/hub restart that SIGTERMs the runner and its children directly still archives those child sessions as “Stopped by runner”, not “Hub restart”. That reintroduces audit-trail pollution for the same restart-cascade path this PR is trying to fix; the runner’s own SIGTERM handler only shuts the runner down and does not selectively restamp already-spawned child env before the external signal reaches them.
    Suggested fix:
    // Keep the lifecycle default as Hub restart for externally delivered SIGTERM.
    // Stamp runner-owned stops at the kill sites instead of via static spawn env.
    const stopSession = (sessionId: string): boolean => {
        // before killProcessByChildProcess(...)
        void session.childProcess?.send?.({
            type: 'set-archive-reason',
            reason: 'Stopped by runner'
        })
        void killProcessByChildProcess(session.childProcess)
        return true
    }

Questions

  • None.

Summary

  • Review mode: follow-up after new commits
  • The previous markSessionArchivedFromHub persistence failure finding is addressed, but the follow-up runner env override creates a new audit-attribution regression for hub-restart SIGTERM cascades.

Testing

  • Not run (automation). Suggested coverage: runner-spawned child receiving plain SIGTERM without a stop-session request archives as Hub restart; stop-session/webhook-timeout paths archive as Stopped by runner.

HAPI Bot

Comment thread cli/src/runner/run.ts Outdated
// runner. Genuine user actions (web Archive via KillSession
// RPC, terminal Ctrl-C via SIGINT) still override to
// 'User terminated' in runnerLifecycle.
HAPI_DEFAULT_ARCHIVE_REASON: 'Stopped by runner'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Major] This static env override applies to every runner-spawned child, including children killed by the same external hub/systemd restart cascade this PR is trying to classify as Hub restart. runnerLifecycle reads HAPI_DEFAULT_ARCHIVE_REASON as the default, and the runner's own SIGTERM handler only shuts the runner down; it does not restamp already-spawned children before the external SIGTERM reaches them. Result: runner-spawned sessions in the restart cascade are archived as Stopped by runner, polluting the audit trail again.

Suggested fix:

// Keep the child default as Hub restart for externally delivered SIGTERM.
// Stamp runner-owned stops at the kill sites instead of via static spawn env.
const stopSession = (sessionId: string): boolean => {
    // before killProcessByChildProcess(...)
    void session.childProcess?.send?.({
        type: 'set-archive-reason',
        reason: 'Stopped by runner'
    })
    void killProcessByChildProcess(session.childProcess)
    return true
}

Reverts `1c8972a3`. Bot review round 3 surfaced that the env-on-spawn
approach (the bot's own round-1 suggestion shape) mislabels
hub-restart-cascade SIGTERMs against runner-spawned children: systemd
killcgroup on `hapi-runner.service` stop sends SIGTERM to all
runner-children directly, and those would archive as 'Stopped by runner'
instead of 'Hub restart'.

The two suggestions are mutually incompatible without adding an IPC
channel (stdio: 'ipc' on spawn) so the runner can stamp
setArchiveReason via childProcess.send() before SIGTERMing. That is a
refactor, not a smallest-defensible change.

Going back to the simple shape: SIGTERM default is 'Hub restart' for
everyone, runner-internal stop paths share that label. The
audit-trail-correctness criterion from the tiann#914 issue is met
(SIGTERM no longer falsely labels as 'User terminated'). Finer
attribution between cascade vs runner-stop is deferred as a follow-up.

Refs tiann#914.

Co-authored-by: Cursor <cursoragent@cursor.com>
@heavygee

Copy link
Copy Markdown
Collaborator Author

Bot review round 3 on `123c625e` flagged that `HAPI_DEFAULT_ARCHIVE_REASON='Stopped by runner'` on spawn mislabels hub-restart-cascade SIGTERMs against runner-spawned children: systemd cgroup-kill on `hapi-runner.service` stop hits runner-children directly with SIGTERM, and those now archive as 'Stopped by runner' instead of the 'Hub restart' that the cascade narrative wants.

This is the mirror image of the bot's round-1 finding. The two suggestions are mutually incompatible without adding an IPC channel (`stdio: 'ipc'` on spawn + `childProcess.send()`) so the runner can stamp `setArchiveReason` via IPC before SIGTERMing - which is a refactor, not a smallest-defensible change.

Pushed `a97b9dca` to revert the env-var override. The SIGTERM default is now plain 'Hub restart' for everyone, including runner-internal stop paths (`hapi runner stop-session`, webhook-timeout cleanup, orphan cleanup). The original #914 acceptance criterion - "Hub-restart cascade archives recorded with a reason that is NOT 'User terminated'" - is still met. Finer attribution between cascade vs runner-stop is deferred as a follow-up that needs the IPC channel.

`bun typecheck` clean. `bun run test` 982/983 (same pre-existing `runner.integration.test.ts` flake on upstream/main).

@tiann happy to add the IPC channel as a separate PR if you'd rather have the finer attribution in-flight; just wanted to keep this bundle scoped to the four issues it's titled for.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Clean completions are archived as hub restarts — changing the lifecycle default to Hub restart means every normal agent-loop completion that only calls setSessionEndReason('completed') now writes archiveReason: 'Hub restart' during archiveAndClose. For example, Codex/Claude/Cursor/Gemini/Kimi/OpenCode all set the end reason to completed before cleanupAndExit, but none set an archive reason on that path, so successful exits pollute the audit trail as restart cascades. Evidence: cli/src/agent/runnerLifecycle.ts:46 (changed default), with the write at cli/src/agent/runnerLifecycle.ts:59 and completion call sites such as cli/src/codex/runCodex.ts:377.
    Suggested fix:
    const setSessionEndReason = (reason: SessionEndReason) => {
        sessionEndReason = reason
        if (reason === 'completed' && archiveReason === 'Hub restart') {
            archiveReason = 'Session completed'
        }
    }

Questions

  • None.

Summary

  • Review mode: follow-up after new commits
  • The previous runner env override finding is addressed, but the new lifecycle default still leaks into clean completion paths and misattributes successful exits as hub restarts.

Testing

  • Not run (automation). Suggested coverage: a lifecycle cleanup after setSessionEndReason('completed') writes a non-restart archive reason while SIGTERM/default cleanup still writes Hub restart.

HAPI Bot

// the runner can stamp `setArchiveReason` before SIGTERMing; tracked as
// a follow-up to keep this PR focussed on the user-action lie that
// motivated #914.
let archiveReason = 'Hub restart'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Major] This changed default is also used by ordinary cleanup() / cleanupAndExit() calls, not only by the SIGTERM handler. The clean completion paths in the launchers set only sessionEndReason to completed before cleanup (for example runCodex.ts:377), and archiveAndClose persists whatever archiveReason currently holds. Result: successful agent exits now get archived with archiveReason: 'Hub restart', which pollutes the audit trail this PR is trying to correct.

Suggested fix:

const setSessionEndReason = (reason: SessionEndReason) => {
    sessionEndReason = reason
    if (reason === 'completed' && archiveReason === 'Hub restart') {
        archiveReason = 'Session completed'
    }
}

Addresses bot review round 4 of tiann#923 (Major): every agent runner
(runClaude, runCodex, runCursor, runGemini, runKimi, runOpencode)
calls setSessionEndReason('completed') on the natural exit path
without touching archiveReason. With the SIGTERM default flipped to
'Hub restart', clean completions were now archived as restart
cascades.

Fix: setSessionEndReason flips archiveReason to 'Session completed'
when it transitions to 'completed' AND no caller has already overridden
the archive reason. This covers all six agent runners with a single
setter change (no per-runner edits).

Two new tests cover the natural-completion default and the override
precedence (explicit setArchiveReason still wins).

Refs tiann#914.

Co-authored-by: Cursor <cursoragent@cursor.com>
@heavygee

Copy link
Copy Markdown
Collaborator Author

Round 4 bot finding is legitimate and was pre-existing — even before this PR, clean completions wrote `archiveReason: 'User terminated'` (the OLD default), which was also wrong. My default-flip just changed the noise from one wrong label to another. Worth fixing while we're here.

Verified all six agent runners call `setSessionEndReason('completed')` before `cleanupAndExit()` on the natural exit path without setting an archive reason: `runClaude.ts:461`, `runCodex.ts:377`, `runCursor.ts:203`, `runGemini.ts:195`, `runKimi.ts:185`, `runOpencode.ts:308`.

Pushed `98b10319`. Applied the bot's suggested shape: the setter now flips `archiveReason` to 'Session completed' when transitioning to 'completed' AND no caller has overridden first. One change covers all six runners. Two new tests cover the natural-completion default and the override-precedence guard (explicit `setArchiveReason('User terminated')` before `setSessionEndReason('completed')` still wins).

`bun typecheck` clean, 984/985 (same pre-existing `runner.integration.test.ts` flake).

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • No issues found.

Questions

  • None.

Summary

  • Review mode: follow-up after new commits
  • No high-confidence correctness, security, data-loss, performance, or maintainability issues found in the current diff. The previous bot finding about clean completions being archived as hub restarts is addressed in this head.

Testing

  • Not run (automation).

HAPI Bot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant