Skip to content

fix(beta/policies): pair tool calls/results after history trim#2800

Merged
Lancetnik merged 3 commits into
ag2ai:mainfrom
obchain:fix/issue-2793
May 11, 2026
Merged

fix(beta/policies): pair tool calls/results after history trim#2800
Lancetnik merged 3 commits into
ag2ai:mainfrom
obchain:fix/issue-2793

Conversation

@obchain
Copy link
Copy Markdown
Contributor

@obchain obchain commented May 11, 2026

Why are these changes needed?

SlidingWindowPolicy and TokenBudgetPolicy only stripped orphan ToolResultsEvents at the head of the trimmed event list. When the matching ToolCallsEvent was trimmed but the ToolResultsEvent survived at a non-leading position — for example a mid-window orphan, or events carried over between agents that share a stream — the assembled messages were sent to providers like OpenAI and rejected with:

Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.", ...}}

Fix

New helper autogen/beta/policies/_pairing.py::ensure_tool_pairing scans the full trimmed list, collects surviving tool-call IDs from ModelResponse.tool_calls, and drops any ToolResultsEvent with no matching parent. Both SlidingWindowPolicy and TokenBudgetPolicy call it after their respective trimming step, replacing the prior leading-only while isinstance(trimmed[0], ToolResultsEvent) loop.

Commits

Split atomically:

  1. fix(policies): add tool-call pairing helper — helper module only.
  2. fix(sliding_window): drop orphan tool results — wire helper, add tests, Fixes SlidingWindowPolicy: incomplete tool call/result pairing causes OpenAI 400 errors #2793.
  3. fix(token_budget): drop orphan tool results — same bug in sibling policy, wire helper, add test.

Tests

uv run pytest test/beta/policies/ -q → 18 passed (previously 14). New cases:

  • sliding_window: mid-window orphan dropped; orphans at multiple positions dropped; paired call+result inside window preserved.
  • token_budget: mid-window orphan dropped.

Existing head-orphan tests continue to pass — behavior is a strict superset of the old leading-only logic.

Related issue number

Fixes #2793

Checks

  • I've included any doc changes needed (none required — internal helper, no public-API surface change).
  • I've added tests corresponding to the changes introduced in this PR.
  • I've made sure all auto checks have passed.

AI assistance

  • I understand the changes in this PR and can explain them in my own words.
  • I have verified that the PR description accurately reflects the actual diff.
  • If AI assistance was used, I reviewed, tested, and validated the generated code/text before submitting.

obchain added 3 commits May 11, 2026 13:09
Introduce ensure_tool_pairing(events): scan an event list, collect tool-call
IDs from surviving ModelResponse.tool_calls, and drop ToolResultsEvents with
no matching parent. Required by providers (OpenAI) that reject tool-role
messages without a preceding tool_calls message. No caller yet — wired in
the following commits.

Refs ag2ai#2793
SlidingWindowPolicy only stripped orphan ToolResultsEvents at the head of the
window. A ToolResultsEvent whose matching ToolCallsEvent was trimmed but that
survived deeper in the window (mid-window orphan, or events carried over
between agents on shared streams) passed through and caused providers (OpenAI)
to reject the assembled messages with:

    Invalid parameter: messages with role 'tool' must be a response
    to a preceeding message with 'tool_calls'.

Call ensure_tool_pairing on the trimmed window so orphans at any position
are dropped. Add tests for mid-window orphans, multi-position orphans, and
the paired-and-kept case.

Fixes ag2ai#2793
TokenBudgetPolicy carried the same head-only orphan-stripping logic as
SlidingWindowPolicy and shared the same bug: orphan ToolResultsEvents at
non-leading positions in the budgeted window survived and triggered
provider 400s.

Reuse ensure_tool_pairing on the retained event list. Add a mid-window
orphan test to mirror the sliding_window coverage.

Refs ag2ai#2793
@obchain obchain requested a review from Lancetnik as a code owner May 11, 2026 07:40
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 11, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions Bot added the beta label May 11, 2026
@Lancetnik Lancetnik enabled auto-merge May 11, 2026 09:01
@Lancetnik
Copy link
Copy Markdown
Member

@obchain thank you for the contribution! And special thanks for reading and following AI Policy

@Lancetnik Lancetnik added this pull request to the merge queue May 11, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
autogen/beta/policies/_pairing.py 100.00% <100.00%> (ø)
autogen/beta/policies/sliding_window.py 100.00% <100.00%> (ø)
autogen/beta/policies/token_budget.py 96.00% <100.00%> (ø)

... and 56 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Merged via the queue into ag2ai:main with commit 38ce844 May 11, 2026
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SlidingWindowPolicy: incomplete tool call/result pairing causes OpenAI 400 errors

3 participants