fix(sdk): widen MessageBuilder truncation candidate set and tie budget to maxInputTokens (CLINE-2191) by robinnewhouse · Pull Request #10747 · cline/cline

robinnewhouse · 2026-05-14T05:12:42Z

Related Issue

Issue: CLINE-2191 — Widen MessageBuilder truncation candidate set (Layer A)
Stacked on: #10740 — please merge that first.
Sibling follow-up: CLINE-2192 — Absolute hard guarantee (Layer B). This PR does NOT guarantee the request fits.

Description

After #10739 (allowlist) and #10740 (protected-tail trim), the post-compaction preservation set inside the in-flight turn still contains the typed user prompt, the last assistant message, and any in-flight tool_use verbatim. Each can individually exceed any context window. MessageBuilder.buildForApi already has the middle-truncation infrastructure (truncateMiddleByChars, truncateMiddleToBytes, an aggregate-budget pass at truncateToTotalTextBudget). It just doesn't see enough block types and runs against a hardcoded 6 MB cap rather than the model's actual maxInputTokens.

This PR is the preventive layer — the typical over-budget request stops being a 400 and becomes a truncated-but-shipped request. The brick-wall guarantee for adversarial inputs is sibling CLINE-2192, which stacks on top of this one. We are not over-selling: this PR does NOT yet meet the "Ever EVER" bar.

Changes

In sdk/packages/core/src/session/services/message-builder.ts:

collectTruncationCandidates widened. In addition to the existing tool_result candidates it now collects user text, assistant text, thinking block text, and top-level file block content. redacted_thinking is skipped (placeholder string). tool_use blocks are skipped — a JSON-aware structural truncator that avoids corrupting tool_use_id or breaking the input JSON shape is Layer B's responsibility.
Aggregate budget tied to maxInputTokens. buildForApi(messages, { maxInputTokens }) now derives the cap from the model's actual maxInputTokens (× CHARS_PER_TOKEN = 3). When maxInputTokens is not provided the constructor's hardcoded 6 MB default still applies, so legacy callers behave identically.
Deterministic sort. The largest-first sort gains an insertion-order tiebreaker so equal-byte candidates always truncate in the same order. Layer B relies on the same property.

In sdk/packages/core/src/runtime/orchestration/session-runtime-orchestrator.ts:

createRuntimeHooks and createRuntimePrepareTurn now thread modelInfo.maxInputTokens through prepareMessagesForModelRequest → prepareProviderMessagesForApi → buildForApi. The cap reflects the model that is actually being called, not a 6 MB approximation.

In sdk/packages/shared/src/llms/tokens.ts + index.ts + index.browser.ts:

CHARS_PER_TOKEN was previously private. Exported alongside the existing estimateTokens.

Test Procedure

Seven new cases in sdk/packages/core/src/session/services/message-builder.test.ts:

truncates user text blocks under the aggregate budget (CLINE-2191) — 5 MB user text, 500 KB budget, asserts truncation marker and final size.
truncates assistant text and thinking blocks under the aggregate budget (CLINE-2191) — 2 MB text + 1 MB thinking, 250 KB budget, asserts both truncated.
truncates top-level file blocks under the aggregate budget (CLINE-2191) — 4 MB file block, 200 KB budget.
skips tool_use input bodies (CLINE-2191, deferred to Layer B) — 4 MB tool_use.input.body stays untouched. Doc-test for the deferral.
derives the aggregate budget from maxInputTokens when provided (CLINE-2191) — buildForApi(messages, { maxInputTokens: 100_000 }) → 300 000 byte cap.
falls back to the constructor default budget when maxInputTokens is absent (CLINE-2191) — legacy callers unchanged.
produces deterministic output for equal-byte-length candidates (CLINE-2191) — two equal-byte candidates → byte-identical output across two runs.

Verification:

sdk/packages/core $ bun run typecheck:smoke   # clean
sdk/packages/core $ bun run test:unit -- message-builder.test
  ✓ 16 tests passed (9 existing + 7 new)
sdk/packages/core $ bun run test:unit
  Test Files  102 passed | 1 skipped (103)
       Tests  920 passed | 4 skipped (924)

What this PR still does NOT fix (intentional, deferred to CLINE-2192)

tool_use.input truncation. JSON-aware structural truncator that drills into string values without corrupting tool_use_id or input shape.
Adversarial inputs. A pathological transcript of 50k small blocks each below the MIN_TOTAL_BUDGET_TOOL_RESULT_BYTES = 8_000 floor would still overshoot — the largest-first heuristic can't shrink below the floor. Layer B adds the final byte-cap enforcement.
User-visible status notice and task.emergency_truncation telemetry. Only fires when we enter genuinely degraded mode; Layer B's concern.
Hard-error path. Policy is "degrade, never error" — neither layer hard-errors.

Type of Change

🐛 Bug fix (non-breaking change which fixes an issue)

Pre-flight Checklist

Changes are limited to a single feature, bugfix or chore
Tests are passing and code is formatted and linted
I have reviewed contributor guidelines

Additional Notes

The orchestrator threading is a small public-API change (MessageBuilder.buildForApi gains an optional second argument). Existing tests construct MessageBuilder directly without passing maxInputTokens and behave unchanged. The 6 MB hardcoded default stays as the fallback floor so any caller that doesn't yet know about maxInputTokens gets the pre-PR behavior.

Diagram

PR #10747 / CLINE-2191
Layer A: heuristic total text budget in MessageBuilder

Provider messages after per-block truncation:

  text blocks
  file blocks
  tool_result text
  other shrinkable text
        |
        v
  measure total text bytes/chars
        |
        v
  budget = maxInputTokens * 3 chars/token
        |
        v
  over budget?
        |
   +----+----+
   |         |
  no        yes
   |         |
   v         v
 return   choose largest shrinkable blocks first
          middle-truncate them
          skip signed/unshrinkable thinking
              |
              v
          provider payload usually fits

linear-code · 2026-05-14T05:12:46Z

CLINE-2191

greptile-apps · 2026-05-14T05:18:00Z

Greptile Summary

This PR widens MessageBuilder's truncation candidate set beyond tool_result blocks to also include top-level text and file blocks, and ties the aggregate text budget to the model's actual maxInputTokens (×3 chars/token) rather than a hardcoded 6 MB cap. modelInfo.maxInputTokens is now threaded from the orchestrator through prepareMessagesForModelRequest → prepareProviderMessagesForApi → buildForApi.

thinking and redacted_thinking blocks are intentionally excluded from candidates because truncating their text would invalidate the API signature; they are also removed from countMessageTextBytes so they no longer inflate the overflow calculation.
tool_use.input truncation is explicitly deferred to Layer B (CLINE-2192), and CHARS_PER_TOKEN is promoted to a public export so both layers share the same chars-per-token constant.

Confidence Score: 5/5

Safe to merge; both previously-flagged issues (thinking bytes inflating the uncollectable overflow, and thinking-text mutation invalidating the API signature) are cleanly resolved.

The two concrete defects called out in prior review rounds are both addressed: thinking blocks are removed from countMessageTextBytes so they no longer make the budget goal unreachable, and thinking blocks are skipped in collectTruncationCandidates so signatures are never invalidated. The orchestrator threading is mechanical and touches only the call sites. Remaining notes are test-coverage quality and a PR-description inaccuracy, neither of which affects runtime behaviour.

message-builder.test.ts — the file-block aggregate-budget test exercises the per-block truncation path rather than the aggregate path it describes; worth correcting before CLINE-2192 stacks on top.

Important Files Changed

Filename	Overview
sdk/packages/core/src/session/services/message-builder.ts	Core logic changes: countMessageTextBytes no longer counts thinking bytes (fixing the previously-flagged unkillable-bytes issue), collectTruncationCandidates now adds text and file blocks, truncateToTotalTextBudget derives budget from maxInputTokens, and the sort gains a deterministic tiebreaker.
sdk/packages/core/src/session/services/message-builder.test.ts	Seven new CLINE-2191 test cases added. The file-block aggregate-budget test uses maxToolResultChars (50 K) smaller than the budget (200 K), so truncation is done by the per-block pass rather than the aggregate pass it intends to cover.
sdk/packages/core/src/runtime/orchestration/session-runtime-orchestrator.ts	Threads modelInfo.maxInputTokens into createRuntimeHooks and createRuntimePrepareTurn → prepareMessagesForModelRequest → prepareProviderMessagesForApi → buildForApi. Change is mechanical and correct.
sdk/packages/shared/src/llms/tokens.ts	CHARS_PER_TOKEN promoted from module-private to public export; no logic change.
sdk/packages/shared/src/index.ts	Re-exports CHARS_PER_TOKEN alongside the existing estimateTokens export.
sdk/packages/shared/src/index.browser.ts	Mirrors the index.ts CHARS_PER_TOKEN re-export for browser consumers.

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
sdk/packages/core/src/session/services/message-builder.test.ts:590-591
The test name promises coverage of the **aggregate** budget path for `file` blocks, but it doesn't actually exercise that path. `MessageBuilder(50_000, undefined, 200_000)` sets `maxToolResultChars = 50_000`, so `transformBlock` → `truncateMiddle` trims the 4 MB file to ≤50 KB during the per-block pass. Because 50 KB < 200 KB budget, `truncateToTotalTextBudget` short-circuits on the early-return guard and never enters the loop. Setting `maxToolResultChars` well above the aggregate budget (e.g., `new MessageBuilder(10_000_000, undefined, 200_000)`) would make the aggregate pass the active code path and give this test its claimed coverage.

```suggestion
	it("truncates top-level file blocks under the aggregate budget (CLINE-2191)", () => {
		// maxToolResultChars must exceed the aggregate budget so the per-block
		// pass does not truncate first; otherwise truncateToTotalTextBudget
		// short-circuits and the aggregate path is never exercised.
		const builder = new MessageBuilder(10_000_000, undefined, 200_000);
```

### Issue 2 of 2
sdk/packages/core/src/session/services/message-builder.ts:893-907
**PR description vs. implementation mismatch on `thinking` blocks**

The "Changes" section of the PR description states the candidate set "now collects … `thinking` block text", but the code (and the new tests) correctly show that all `thinking` blocks — signed or unsigned — are **skipped**. The code comment and test `"truncates assistant text but preserves signed thinking blocks"` both confirm the intentional skip.

This is purely a description inaccuracy, but it could mislead a reviewer or a developer reading the PR later into believing that `thinking` content is being middle-truncated today. Worth correcting in the PR body so the record is accurate for CLINE-2192 reviewers who will stack on top of this.

_{Reviews (5): Last reviewed commit: "fix(sdk): align layer-a thinking budget ..." | Re-trigger Greptile}

robinnewhouse · 2026-05-14T20:11:13Z

@greptileai review

robinnewhouse · 2026-05-14T20:31:08Z

@greptileai review

robinnewhouse · 2026-05-14T20:49:19Z

@greptileai review

…t to maxInputTokens (CLINE-2191) CLINE-2191 Layer A. After PR #10739 (allowlist) and PR #10740 (protected-tail trim) the preservation set inside the in-flight turn still included the typed user prompt, the last assistant message, and any in-flight tool_use verbatim. Each can individually exceed any context window. This PR closes the common case. Changes to MessageBuilder: (1) collectTruncationCandidates also collects user text, assistant text, thinking blocks, and top-level file blocks. tool_use input bodies remain untouched here — a JSON-aware structural truncator that avoids corrupting tool_use_id or breaking input JSON shape is Layer B (CLINE-2192). (2) buildForApi accepts an optional maxInputTokens; when present the aggregate budget becomes maxInputTokens * CHARS_PER_TOKEN, otherwise it falls back to the hardcoded 6 MB default so legacy callers keep their behavior. (3) The largest-first sort gains an insertion-order tiebreaker so the output is deterministic across runs (Layer B will rely on this). Orchestrator: createRuntimeHooks and createRuntimePrepareTurn now thread modelInfo.maxInputTokens through to prepareProviderMessagesForApi -> buildForApi so the cap reflects the model that is actually being called. Shared package: CHARS_PER_TOKEN was previously private inside @cline/shared/llms/tokens.ts. Exported alongside the existing estimateTokens. Tests: 7 new cases in message-builder.test.ts. 913 -> 920 @cline/core unit tests pass. typecheck:smoke clean. This is the preventive layer. It does NOT yet provide the hard guarantee "the outbound request will never exceed the context window." That guarantee lives in CLINE-2192 (Layer B) which stacks on top of this PR.

robinnewhouse · 2026-05-14T21:37:06Z

@greptileai review

robinnewhouse mentioned this pull request May 14, 2026

fix(sdk): trim oldest completed tool pairs from protected tail in basic compaction (CLINE-2185) #10740

Open

10 tasks

greptile-apps Bot reviewed May 14, 2026

View reviewed changes

Comment thread sdk/packages/core/src/session/services/message-builder.ts Outdated

robinnewhouse mentioned this pull request May 14, 2026

fix(sdk): enforce hard provider request byte budget (CLINE-2192) #10759

Draft

10 tasks

robinnewhouse force-pushed the robin/cline-2185-compaction-cant-shrink-in-flight-turn branch from d94c054 to 3dae349 Compare May 14, 2026 20:02

robinnewhouse force-pushed the robin/cline-2191-widen-messagebuilder-truncation-candidate-set branch from 5ee594e to 945bd1e Compare May 14, 2026 20:05

robinnewhouse force-pushed the robin/cline-2185-compaction-cant-shrink-in-flight-turn branch from 3dae349 to 0438d29 Compare May 14, 2026 20:26

robinnewhouse force-pushed the robin/cline-2191-widen-messagebuilder-truncation-candidate-set branch from 945bd1e to 91f9c04 Compare May 14, 2026 20:26

robinnewhouse force-pushed the robin/cline-2191-widen-messagebuilder-truncation-candidate-set branch from 91f9c04 to fd7a192 Compare May 14, 2026 20:43

robinnewhouse force-pushed the robin/cline-2191-widen-messagebuilder-truncation-candidate-set branch from fd7a192 to 1101682 Compare May 14, 2026 21:24

robinnewhouse added 3 commits May 14, 2026 14:31

fix(sdk): preserve thinking blocks during aggregate truncation

88095e6

fix(sdk): align layer-a thinking budget accounting

b9ec39c

robinnewhouse force-pushed the robin/cline-2191-widen-messagebuilder-truncation-candidate-set branch from 1101682 to b9ec39c Compare May 14, 2026 21:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sdk): widen MessageBuilder truncation candidate set and tie budget to maxInputTokens (CLINE-2191)#10747

fix(sdk): widen MessageBuilder truncation candidate set and tie budget to maxInputTokens (CLINE-2191)#10747
robinnewhouse wants to merge 3 commits into
robin/cline-2185-compaction-cant-shrink-in-flight-turnfrom
robin/cline-2191-widen-messagebuilder-truncation-candidate-set

robinnewhouse commented May 14, 2026 •

edited

Loading

Uh oh!

linear-code Bot commented May 14, 2026

Uh oh!

greptile-apps Bot commented May 14, 2026 •

edited

Loading

Important Files Changed

Uh oh!

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

robinnewhouse commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issue

Description

Changes

Test Procedure

What this PR still does NOT fix (intentional, deferred to CLINE-2192)

Type of Change

Pre-flight Checklist

Additional Notes

Diagram

Uh oh!

linear-code Bot commented May 14, 2026

Uh oh!

greptile-apps Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Uh oh!

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

robinnewhouse commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

robinnewhouse commented May 14, 2026 •

edited

Loading

greptile-apps Bot commented May 14, 2026 •

edited

Loading