fix: transition Claude Code session to idle when API retries are exhausted #60
Merged
Pull request overview
This PR updates the Claude Code session monitor to correctly transition sessions to idle when Claude Code terminates due to an exhausted-retries API failure (where no Stop hook is emitted), by detecting a synthetic JSONL assistant message marker.
Changes:
- Extend JSONL parsing to recognize Claude Code's synthetic API error termination message (`isApiErrorMessage`).
- Update Claude Code session sync logic to transition `running`/`waiting` → `idle` when the last JSONL line indicates a terminal API error and is newer than the session's `updated_at`.
- Add test coverage for the new transition behavior, including the recency and "last-line only" guards and ignoring non-terminal retry logs.
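The detection described in the first bullet could look roughly like the sketch below. It is not the code from `internal/monitor/monitor.go`: the struct fields, the `parseLine` helper, and the check that `type` equals `"assistant"` are assumptions made for illustration. Only the `isApiErrorMessage` marker and the `type:"system"` / `subtype:"api_error"` retry logs come from this PR.

```go
package main

import "encoding/json"

// jsonlLine models only the fields this sketch needs. The field set is an
// assumption; the real struct in internal/monitor may differ.
type jsonlLine struct {
	Type              string `json:"type"`
	Subtype           string `json:"subtype"`
	Timestamp         string `json:"timestamp"`
	IsAPIErrorMessage bool   `json:"isApiErrorMessage"`
}

// parseLine is a hypothetical helper that decodes one JSONL record.
// ok is false when the input is not valid JSON.
func parseLine(raw string) (line jsonlLine, ok bool) {
	if err := json.Unmarshal([]byte(raw), &line); err != nil {
		return jsonlLine{}, false
	}
	return line, true
}

// isAPIErrorLine reports whether the line is the synthetic assistant message
// written once retries are exhausted. Intermediate retry logs
// (type "system", subtype "api_error") do not carry the marker, so they
// never match.
func isAPIErrorLine(line jsonlLine) bool {
	return line.Type == "assistant" && line.IsAPIErrorMessage
}
```

With this shape, a record such as `{"type":"system","subtype":"api_error"}` parses successfully but is not treated as terminal, which is the distinction the third bullet's tests rely on.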
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| internal/monitor/monitor.go | Detect synthetic terminal API-error JSONL line and CAS-transition session status to idle. |
| internal/monitor/monitor_test.go | Add tests covering running/waiting → idle on terminal API error and guard cases. |
Comment on lines 256 to +306
```go
if len(lines) > 0 {
	var lastLine jsonlLine
	var maxTimestamp string
	for _, raw := range lines {
		var line jsonlLine
		if err := json.Unmarshal([]byte(raw), &line); err != nil {
			continue
		}
		lastLine = line
		if line.Timestamp != "" && (maxTimestamp == "" || isAfter(line.Timestamp, maxTimestamp)) {
			maxTimestamp = line.Timestamp
		}
	}

	// Interruption check takes priority over waiting→running.
	if isInterruptionLine(lastLine) && isAfter(lastLine.Timestamp, sess.UpdatedAt) {
		st := status.Status(sess.Status)
		if st == status.Running || st == status.Waiting {
			if err := queries.UpdateSessionStatusIfUnchanged(ctx, sqlc.UpdateSessionStatusIfUnchangedParams{
				Status:       string(status.Idle),
				UpdatedAt:    timestamp.Now(),
				WaitingSince: "",
				Name:         sess.Name,
				Path:         sess.Path,
				Status_2:     string(st),
			}); err != nil {
				return fmt.Errorf("update status to idle: %w", err)
			}
		}
		return nil
	}

	// API error termination (e.g. 529 Overloaded after retry exhaustion):
	// Claude Code does not fire a Stop hook in this case, so the session
	// would otherwise remain stuck in `running`.
	if isAPIErrorLine(lastLine) && isAfter(lastLine.Timestamp, sess.UpdatedAt) {
		st := status.Status(sess.Status)
		if st == status.Running || st == status.Waiting {
			if err := queries.UpdateSessionStatusIfUnchanged(ctx, sqlc.UpdateSessionStatusIfUnchangedParams{
				Status:       string(status.Idle),
				UpdatedAt:    timestamp.Now(),
				WaitingSince: "",
				Name:         sess.Name,
				Path:         sess.Path,
				Status_2:     string(st),
			}); err != nil {
				return fmt.Errorf("update status to idle on api error: %w", err)
			}
		}
		return nil
	}
```
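Both branches pass the previously observed status as `Status_2`, so the write only takes effect if the row still holds that status. The compare-and-swap idea can be sketched with an in-memory stand-in. `sessionStore` and `updateStatusIfUnchanged` are hypothetical names, and the real `UpdateSessionStatusIfUnchanged` query is generated by sqlc and is not shown in this excerpt.

```go
package main

import "sync"

// sessionStore is a hypothetical in-memory stand-in for the sessions table.
type sessionStore struct {
	mu     sync.Mutex
	status map[string]string // session name -> current status
}

// updateStatusIfUnchanged sets the status to next only if the stored value
// still equals expected, and reports whether the write happened.
func (s *sessionStore) updateStatusIfUnchanged(name, expected, next string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.status[name] != expected {
		return false // another writer changed the status first; keep its value
	}
	s.status[name] = next
	return true
}
```

In this sketch a stale writer is a silent no-op rather than an error, so a hook that updates the session between the monitor's read and its write is not overwritten.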
When the Claude API returns a terminal error (e.g. 529 Overloaded) and retries are exhausted, Claude Code writes a synthetic assistant message with `isApiErrorMessage: true` to the JSONL but does not fire a `Stop` hook. As a result, the muxac session status stayed stuck at `running` even though the agent was no longer processing.

The monitor now inspects the last JSONL line for that synthetic API-error marker and, when it is found with a timestamp newer than the session's `updated_at`, transitions `running`/`waiting` sessions to `idle` (using the existing CAS update for safe concurrent writes). Includes tests covering the running/waiting transitions, the recency guard, the last-line guard, and the case where intermediate `type:"system"`, `subtype:"api_error"` retry logs must not trigger the transition.
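The guards listed above can be condensed into one decision function, shown here as a minimal sketch rather than the monitor's actual code. `record`, `shouldGoIdle`, and this version of `isAfter` are hypothetical, the records are assumed to be already parsed, and RFC 3339 timestamps are an assumption about the JSONL format.

```go
package main

import "time"

// record is a hypothetical, already-parsed JSONL line.
type record struct {
	Type       string
	Subtype    string
	IsAPIError bool   // stands in for isApiErrorMessage: true
	Timestamp  string // RFC 3339 is assumed
}

// isAfter reports whether timestamp a is strictly later than b.
// Unparseable timestamps never count as newer.
func isAfter(a, b string) bool {
	ta, errA := time.Parse(time.RFC3339, a)
	tb, errB := time.Parse(time.RFC3339, b)
	return errA == nil && errB == nil && ta.After(tb)
}

// shouldGoIdle applies the guards in order: the session must be running or
// waiting, only the last line is inspected, that line must carry the
// API-error marker, and it must be newer than the session's updated_at.
func shouldGoIdle(lines []record, status, updatedAt string) bool {
	if status != "running" && status != "waiting" {
		return false
	}
	if len(lines) == 0 {
		return false
	}
	last := lines[len(lines)-1]
	return last.IsAPIError && isAfter(last.Timestamp, updatedAt)
}
```

Checking only the last line means that an API-error message followed by any later record no longer triggers the transition, and a retry log never does because it lacks the marker.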