Skip to content

fix: prevent task goroutine context from being canceled by HTTP request lifecycle#900

Open
xlyoung wants to merge 1 commit into
mark3labs:mainfrom
xlyoung:fix/task-context-cancellation
Open

fix: prevent task goroutine context from being canceled by HTTP request lifecycle#900
xlyoung wants to merge 1 commit into
mark3labs:mainfrom
xlyoung:fix/task-context-cancellation

Conversation

@xlyoung

@xlyoung xlyoung commented May 31, 2026

Copy link
Copy Markdown

Fixes #897

Problem

When a tool registered with WithTaskSupport(TaskSupportOptional) or AddTaskTool is called asynchronously via streamable HTTP, the task context was derived from the HTTP request context (r.Context()). After the HTTP handler returned the CreateTaskResult response, Go HTTP server canceled r.Context(), which cascaded into the task context and killed the handler goroutine.

Fix

Use context.WithoutCancel(ctx) to detach from the HTTP request lifecycle while preserving context values (session metadata, etc.).

The task context is now only canceled by:

  • The task own timeout/TTL
  • Explicit tasks/cancel from the client
  • Server shutdown

Both executeTaskTool and executeRegularToolAsTask are fixed.

Testing

Updated the corresponding test to verify internal handler cancellation (timeout-based) rather than relying on parent context cancellation. All existing tests pass with -race.

Summary by CodeRabbit

  • Bug Fixes
    • Tasks now continue executing even when the initiating request ends, improving reliability for long-running operations.

…st lifecycle

Fixes mark3labs#897

When a tool registered with WithTaskSupport(TaskSupportOptional) or
AddTaskTool is called asynchronously via streamable HTTP, the task
context was derived from the HTTP request context (r.Context()). After
the HTTP handler returned the CreateTaskResult response, Go's HTTP
server canceled r.Context(), which cascaded into the task context and
killed the handler goroutine.

Fix: Use context.WithoutCancel(ctx) to detach from the HTTP request
lifecycle while preserving context values (session metadata, etc.).
The task context is now only canceled by:
- The task's own timeout/TTL
- Explicit tasks/cancel from the client
- Server shutdown

Both executeTaskTool and executeRegularToolAsTask are fixed.

Updated the corresponding test to verify internal handler cancellation
(timeout-based) rather than relying on parent context cancellation,
which is no longer propagated to task goroutines.
@mark-iii-labs-huly

Copy link
Copy Markdown

Connected to Huly®: MCP_G-464

@coderabbitai

coderabbitai Bot commented May 31, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Task goroutines for asynchronous tool execution are now detached from HTTP request cancellation using context.WithoutCancel(), allowing them to continue after the handler returns. Context detachment is applied in both task execution paths, while preserving context values and supporting explicit task cancellation via stored cancel functions. Test validation confirms the handler self-cancels independently.

Changes

Task context detachment from HTTP lifecycle

Layer / File(s) Summary
Context detachment in task execution paths
server/server.go
Both executeTaskTool (lines 2023–2029) and executeRegularToolAsTask (lines 2118–2124) now create task context via context.WithoutCancel(ctx) wrapped with context.WithCancel, preventing HTTP request cancellation from cascading into task goroutines while retaining explicit task cancellation via cancelFunc.
Test validation of independent task cancellation
server/task_tool_test.go
The "handler returns context.Canceled before tasks/cancel called" subtest (lines 535–602) is updated to verify handler self-cancellation: tool handler now creates a timeout context and returns its error; test invokes handleToolCall with t.Context() (non-cancellable parent), polls task to terminal state, and asserts Cancelled status with context deadline exceeded message. Inline comments (lines 535–543, 552) clarify the internal-cancellation mechanism and context-detachment behavior.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • mark3labs/mcp-go#880: Adds panic recovery handling in executeTaskTool, modifying the same async task execution goroutine logic affected by this PR's context detachment change.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the primary change: preventing task goroutine context from being canceled by the HTTP request lifecycle, which is the main objective of this PR.
Description check ✅ Passed The description includes problem statement, fix details, affected functions, and testing information, covering all essential aspects required by the template structure.
Linked Issues check ✅ Passed The code changes implement the exact fix specified in issue #897: using context.WithoutCancel to detach task context from HTTP request lifecycle while preserving context values in both executeTaskTool and executeRegularToolAsTask.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing issue #897: modifications to task context handling in server.go and corresponding test updates in server/task_tool_test.go align with the stated objectives.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/server.go`:
- Around line 2023-2027: The task context is currently created with
context.WithCancel(context.WithoutCancel(ctx)) which removes parent deadlines so
TTL cleanup only deletes bookkeeping; fix by making the TTL actually cancel
execution: in executeTaskTool / executeRegularToolAsTask create taskCtx from a
non-request parent (e.g., context.Background()) and apply the TTL via
context.WithTimeout or context.WithDeadline to produce a cancellable context
assigned to taskCtx (or ensure scheduleTaskCleanup calls entry.cancelFunc() /
cancelTask before deleting s.tasks); update scheduleTaskCleanup to invoke
entry.cancelFunc() when TTL expires (then remove bookkeeping) so background
goroutines are actually stopped.

In `@server/task_tool_test.go`:
- Around line 533-543: The test "task tool handler returns context.Canceled
before tasks/cancel called" no longer verifies the `#897` regression because the
handler cancels an inner context (innerCtx) while the parent/request context is
never cancelled; add an additional subcase that explicitly cancels the
parent/request context immediately after calling handleToolCall (simulate parent
cancellation), then assert the task is not marked cancelled/failed until the
handler's own cancellation path runs (i.e., the task remains alive until
innerCtx times out); apply the same change to the similar test block around
lines 577-601 so both scenarios (parent-cancel and handler-internal-cancel) are
covered using the existing test helpers and the same assertions on task state.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cbed8717-457f-4f6c-8212-4bc266995e49

📥 Commits

Reviewing files that changed from the base of the PR and between 481f056 and f1408f6.

📒 Files selected for processing (2)
  • server/server.go
  • server/task_tool_test.go

Comment thread server/server.go
Comment on lines +2023 to +2027
// Create cancellable context for this task execution.
// Use WithoutCancel to detach from the HTTP request lifecycle so the task
// continues running after the HTTP handler returns. Context values (session
// metadata, etc.) are preserved.
taskCtx, cancel := context.WithCancel(context.WithoutCancel(ctx))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

In Go's standard library, does context.WithoutCancel preserve the parent context's deadline or Done channel?

💡 Result:

In Go's standard library, context.WithoutCancel does not preserve the parent context's deadline, and its Done channel is nil [1][2][3]. According to the official documentation, context.WithoutCancel returns a derived context that points to the parent context but is explicitly designed to not be canceled when the parent is canceled [1][3]. Specifically, the returned context has the following properties: - Deadline: It returns no deadline [1][2]. - Done channel: Its Done channel is nil [1][2]. - Err: It returns no error [1][2]. - Cause: Calling Cause on it returns nil [1][2]. While it does not carry over the cancellation or deadline signals from the parent, it does maintain the ability to access values stored in the parent context through its Value method, as it still points to the parent [4]. This makes it useful for scenarios where you need to preserve request-scoped values for background tasks while detaching those tasks from the lifetime of the original request [4].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the relevant segments around the referenced line ranges
sed -n '2000,2060p' server/server.go
echo "----"
sed -n '2080,2160p' server/server.go
echo "----"
sed -n '1920,2005p' server/server.go
echo "----"
sed -n '2480,2605p' server/server.go

# Locate createTask and task execution/cancellation usage
echo "==== createTask definition/uses ===="
rg -n "func (createTask|\\w*createTask)\\b|\\bcreateTask\\b" server/server.go
echo "----"
# Print the function(s) containing createTask, if found
rg -n "createTask\\(" server/server.go | head -n 20

Repository: mark3labs/mcp-go

Length of output: 12006


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "scheduleTaskCleanup\\b|cancelTask\\b|tasks/cancel|TaskStatusCancelled|cancelFunc\\b" server/server.go
echo "----"
# Show likely relevant implementations around those matches
# (Using approximate line windows from rg results via grep for line numbers)
python3 - <<'PY'
import re,subprocess,os,sys,json,math
from pathlib import Path
text=subprocess.check_output(["bash","-lc","rg -n \"scheduleTaskCleanup\\\\b|cancelTask\\\\b|tasks/cancel|TaskStatusCancelled|cancelFunc\\\\b\" server/server.go"],text=True)
lines=[]
for line in text.strip().splitlines():
    m=re.match(r"(\d+):", line)
    if m:
        lines.append(int(m.group(1)))
# print windows around each match
for ln in sorted(set(lines)):
    start=max(1,ln-30); end=ln+80
    print(f"\n===== server/server.go:{start}-{end} (around {ln}) =====")
    subprocess.run(["bash","-lc", f"sed -n '{start},{end}p' server/server.go"], check=False)
PY

Repository: mark3labs/mcp-go

Length of output: 53361


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Narrow, targeted inspection to avoid truncation
echo "===== executeTaskTool context-withoutcancel sites ====="
sed -n '2010,2055p' server/server.go
echo "----"
sed -n '2105,2175p' server/server.go

echo "===== createTask TTL scheduling call ====="
sed -n '2470,2565p' server/server.go

echo "===== cancelTask implementation ====="
sed -n '2660,2765p' server/server.go

Repository: mark3labs/mcp-go

Length of output: 9723


Ensure task TTL/timeout actually cancels background execution (not just bookkeeping)

  • executeTaskTool / executeRegularToolAsTask build taskCtx via context.WithCancel(context.WithoutCancel(ctx)); context.WithoutCancel removes the parent deadline and makes Done nil, so the task won’t stop when the request context ends.
  • Cancellation via tasks/cancel calls entry.cancelFunc() correctly (cancelTask), but scheduleTaskCleanup (on TTL expiry) only delete(s.tasks, taskID) / writes s.expiredTasks and never cancels the running goroutine.
  • This means work can continue running after TTL expiry and even after the task record is removed, leaving no reliable way to stop it via TTL.

Fix by applying the TTL as a deadline on the task context (from an appropriate non-request parent) and/or invoking entry.cancelFunc() (or cancelTask) when TTL cleanup triggers, before removing bookkeeping.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/server.go` around lines 2023 - 2027, The task context is currently
created with context.WithCancel(context.WithoutCancel(ctx)) which removes parent
deadlines so TTL cleanup only deletes bookkeeping; fix by making the TTL
actually cancel execution: in executeTaskTool / executeRegularToolAsTask create
taskCtx from a non-request parent (e.g., context.Background()) and apply the TTL
via context.WithTimeout or context.WithDeadline to produce a cancellable context
assigned to taskCtx (or ensure scheduleTaskCleanup calls entry.cancelFunc() /
cancelTask before deleting s.tasks); update scheduleTaskCleanup to invoke
entry.cancelFunc() when TTL expires (then remove bookkeeping) so background
goroutines are actually stopped.

Comment thread server/task_tool_test.go
Comment on lines 533 to +543
t.Run("task tool handler returns context.Canceled before tasks/cancel called", func(t *testing.T) {
// This test verifies that if a handler detects context cancellation
// (e.g., from parent context timeout) and returns ctx.Err() before
// (e.g., from its own internal timeout) and returns ctx.Err() before
// tasks/cancel is explicitly called, the task is still marked as cancelled
// rather than failed.
//
// Note: With the fix for #897 (context.WithoutCancel), the task context
// is now detached from the parent/HTTP request context. This means the
// handler must manage its own cancellation via the task's context, not
// rely on parent context cancellation. We test internal handler cancellation
// by having the handler cancel its own context via a timeout.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

This no longer tests the #897 regression.

The handler now cancels itself via innerCtx, but the parent/request context is never canceled. If the production code regressed to context.WithCancel(ctx), this subtest would still pass because the inner timeout fires first. Please keep a case that cancels the parent right after handleToolCall and asserts the task stays alive until its own cancellation path fires.

Also applies to: 577-601

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/task_tool_test.go` around lines 533 - 543, The test "task tool handler
returns context.Canceled before tasks/cancel called" no longer verifies the `#897`
regression because the handler cancels an inner context (innerCtx) while the
parent/request context is never cancelled; add an additional subcase that
explicitly cancels the parent/request context immediately after calling
handleToolCall (simulate parent cancellation), then assert the task is not
marked cancelled/failed until the handler's own cancellation path runs (i.e.,
the task remains alive until innerCtx times out); apply the same change to the
similar test block around lines 577-601 so both scenarios (parent-cancel and
handler-internal-cancel) are covered using the existing test helpers and the
same assertions on task state.

@ezynda3

ezynda3 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

@xlyoung can you address the coderabbit comments please?

@xlyoung

xlyoung commented Jun 4, 2026

Copy link
Copy Markdown
Author

Hi @ezynda3, thanks for the ping. Ive reviewed both CodeRabbit findings:

Comment 1: TTL cleanup doesnt cancel execution

Youre right that scheduleTaskCleanup only removes the bookkeeping entry without calling cancelFunc(). However, this is a pre-existing issue unrelated to #897 — the original code (context.WithCancel(ctx)) had the same TTL behavior. My PRs scope is strictly detaching task context from HTTP request lifecycle via context.WithoutCancel.

That said, I agree it should be fixed. Two options:

  1. In this PR — add entry.cancelFunc() call in scheduleTaskCleanup before deleting the task. Minimal change, keeps PR focused.
  2. As a follow-up — open a separate issue/PR since its orthogonal to Task goroutine context is canceled when HTTP response is flushed (executeRegularToolAsTask) #897.

Happy to go with either approach — what do you prefer?

Comment 2: Test doesnt verify #897 regression

Good catch. Ill add a subtest that:

  1. Calls handleToolCall with a cancellable parent context
  2. Cancels the parent immediately after
  3. Asserts the task stays alive (not cancelled/failed) until the handlers own internal timeout fires

This proves the context is actually detached from the parent, not just relying on the inner timeout. Ill push an update soon.

Let me know if you want me to proceed with the TTL fix in this PR or keep it as a separate follow-up.

@ezynda3 ezynda3 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary

context.WithoutCancel(ctx) is the right fix here — it detaches the task goroutine from the HTTP request lifecycle while preserving context values, and tasks/cancel / TTL / server shutdown remain the only cancellation paths. Applied to both executeTaskTool and executeRegularToolAsTask. Build, vet, go test ./... -race, golangci-lint, and staticcheck are all clean on HEAD; no regressions vs main.

One blocking item: the updated test file is not gofmt-clean. The project's AGENTS.md expects gofmt-clean code; CI's lint job didn't catch this because .golangci.yml doesn't enable a gofmt/gofumpt linter.

Blocking

server/task_tool_test.go — gofmt regression

Inside t.Run("task tool handler returns context.Canceled before tasks/cancel called", ...), the body from // Step 4: Call tool with task augmentation through the closing }) of the subtest lost one tab of indentation (single tab instead of double). The file compiles and tests pass, but gofmt -d server/task_tool_test.go rewrites ~40 lines.

Fix:

gofmt -w server/task_tool_test.go

Non-blocking suggestions

  1. Extract a helper. The new comment block + context.WithCancel(context.WithoutCancel(ctx)) is duplicated verbatim in executeTaskTool and executeRegularToolAsTask. A small helper like newDetachedTaskContext(ctx) (context.Context, context.CancelFunc) would centralize the WithoutCancel policy and prevent future drift.

  2. Test coverage for the tasks/cancel path. The updated subtest only exercises the internal-timeout cancellation path. Worth confirming a sibling subtest still drives cancellation via an explicit tasks/cancel RPC, so that route's mapping to TaskStatusCancelled stays covered after the WithoutCancel change.

  3. Test step renumbering. Comments now read 1, 2, 4, 5, 6 (Step 3 went away with cancelParent). Minor, but renumbering helps future readers.

What I ran

  • go build ./... — clean on base and HEAD
  • go vet ./... — clean on base and HEAD
  • go test ./... -race -count=1 — pass on base and HEAD
  • golangci-lint run (project config) — 0 issues
  • staticcheck ./... — clean
  • go mod tidy — no diff
  • gofmt -l server/task_tool_test.go server/server.goserver/task_tool_test.go flagged
  • Static security scan on + lines (secrets, exec, SQL, gob, unsafe, InsecureSkipVerify, math/rand, panic, debug leftovers) — all clean

Once gofmt is run, this is good to go from my side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Task goroutine context is canceled when HTTP response is flushed (executeRegularToolAsTask)

2 participants