fix(beta/policies): pair tool calls/results after history trim by obchain · Pull Request #2800 · ag2ai/ag2

obchain · 2026-05-11T07:40:57Z

Why are these changes needed?

SlidingWindowPolicy and TokenBudgetPolicy only stripped orphan ToolResultsEvents at the head of the trimmed event list. When the matching ToolCallsEvent was trimmed but the ToolResultsEvent survived at a non-leading position — for example a mid-window orphan, or events carried over between agents that share a stream — the assembled messages were sent to providers like OpenAI and rejected with:

Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.", ...}}

Fix

New helper autogen/beta/policies/_pairing.py::ensure_tool_pairing scans the full trimmed list, collects surviving tool-call IDs from ModelResponse.tool_calls, and drops any ToolResultsEvent with no matching parent. Both SlidingWindowPolicy and TokenBudgetPolicy call it after their respective trimming step, replacing the prior leading-only while isinstance(trimmed[0], ToolResultsEvent) loop.

Commits

Split atomically:

fix(policies): add tool-call pairing helper — helper module only.
fix(sliding_window): drop orphan tool results — wire helper, add tests, Fixes SlidingWindowPolicy: incomplete tool call/result pairing causes OpenAI 400 errors #2793.
fix(token_budget): drop orphan tool results — same bug in sibling policy, wire helper, add test.

Tests

uv run pytest test/beta/policies/ -q → 18 passed (previously 14). New cases:

sliding_window: mid-window orphan dropped; orphans at multiple positions dropped; paired call+result inside window preserved.
token_budget: mid-window orphan dropped.

Existing head-orphan tests continue to pass — behavior is a strict superset of the old leading-only logic.

Related issue number

Fixes #2793

Checks

I've included any doc changes needed (none required — internal helper, no public-API surface change).
I've added tests corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

AI assistance

I understand the changes in this PR and can explain them in my own words.
I have verified that the PR description accurately reflects the actual diff.
If AI assistance was used, I reviewed, tested, and validated the generated code/text before submitting.

Introduce ensure_tool_pairing(events): scan an event list, collect tool-call IDs from surviving ModelResponse.tool_calls, and drop ToolResultsEvents with no matching parent. Required by providers (OpenAI) that reject tool-role messages without a preceding tool_calls message. No caller yet — wired in the following commits. Refs ag2ai#2793

SlidingWindowPolicy only stripped orphan ToolResultsEvents at the head of the window. A ToolResultsEvent whose matching ToolCallsEvent was trimmed but that survived deeper in the window (mid-window orphan, or events carried over between agents on shared streams) passed through and caused providers (OpenAI) to reject the assembled messages with: Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'. Call ensure_tool_pairing on the trimmed window so orphans at any position are dropped. Add tests for mid-window orphans, multi-position orphans, and the paired-and-kept case. Fixes ag2ai#2793

TokenBudgetPolicy carried the same head-only orphan-stripping logic as SlidingWindowPolicy and shared the same bug: orphan ToolResultsEvents at non-leading positions in the budgeted window survived and triggered provider 400s. Reuse ensure_tool_pairing on the retained event list. Add a mid-window orphan test to mirror the sliding_window coverage. Refs ag2ai#2793

CLAassistant · 2026-05-11T07:41:08Z

All committers have signed the CLA.

Lancetnik · 2026-05-11T09:05:16Z

@obchain thank you for the contribution! And special thanks for reading and following AI Policy

codecov · 2026-05-11T09:11:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines	Coverage Δ
autogen/beta/policies/_pairing.py	`100.00% <100.00%> (ø)`
autogen/beta/policies/sliding_window.py	`100.00% <100.00%> (ø)`
autogen/beta/policies/token_budget.py	`96.00% <100.00%> (ø)`

... and 56 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

obchain added 3 commits May 11, 2026 13:09

obchain requested a review from Lancetnik as a code owner May 11, 2026 07:40

github-actions Bot added the beta label May 11, 2026

Lancetnik approved these changes May 11, 2026

View reviewed changes

Lancetnik enabled auto-merge May 11, 2026 09:01

Lancetnik added this pull request to the merge queue May 11, 2026

Merged via the queue into ag2ai:main with commit 38ce844 May 11, 2026
24 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(beta/policies): pair tool calls/results after history trim#2800

fix(beta/policies): pair tool calls/results after history trim#2800
Lancetnik merged 3 commits into
ag2ai:mainfrom
obchain:fix/issue-2793

obchain commented May 11, 2026

Uh oh!

CLAassistant commented May 11, 2026 •

edited

Loading

Uh oh!

Lancetnik commented May 11, 2026

Uh oh!

codecov Bot commented May 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

obchain commented May 11, 2026

Why are these changes needed?

Fix

Commits

Tests

Related issue number

Checks

AI assistance

Uh oh!

CLAassistant commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Lancetnik commented May 11, 2026

Uh oh!

codecov Bot commented May 11, 2026

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CLAassistant commented May 11, 2026 •

edited

Loading