fix(beta/policies): pair tool calls/results after history trim #2800
Merged
Introduce ensure_tool_pairing(events): scan an event list, collect tool-call IDs from surviving ModelResponse.tool_calls, and drop ToolResultsEvents with no matching parent. Required by providers (OpenAI) that reject tool-role messages without a preceding tool_calls message. No caller yet — wired in the following commits. Refs ag2ai#2793
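A minimal sketch of the pairing idea described in this commit, not the actual helper. The `StubModelResponse` and `StubToolResults` classes and their field names are hypothetical stand-ins for the real event types, and this version tracks IDs in a single forward pass (a result is kept only if its call appears earlier in the list), which is an assumption about ordering rather than something the commit message states.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for the real event classes; field names are assumptions.
@dataclass
class StubModelResponse:
    text: str = ""
    tool_call_ids: list[str] = field(default_factory=list)


@dataclass
class StubToolResults:
    call_ids: list[str]


def ensure_tool_pairing_sketch(events: list) -> list:
    """Drop tool results whose parent tool call does not appear earlier in the list."""
    seen_call_ids: set[str] = set()
    kept = []
    for event in events:
        if isinstance(event, StubModelResponse):
            # Record every tool-call ID that survived trimming.
            seen_call_ids.update(event.tool_call_ids)
            kept.append(event)
        elif isinstance(event, StubToolResults):
            # Keep the result only if every ID it answers has a surviving parent.
            if all(cid in seen_call_ids for cid in event.call_ids):
                kept.append(event)
        else:
            kept.append(event)
    return kept


events = [
    StubToolResults(call_ids=["call_1"]),          # orphan: parent was trimmed
    StubModelResponse(tool_call_ids=["call_2"]),
    StubToolResults(call_ids=["call_2"]),          # paired: kept
]
cleaned = ensure_tool_pairing_sketch(events)
print(len(cleaned))
```

The function returns a new list and leaves the input untouched, so a policy could call it on whatever list its trimming step produced.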
SlidingWindowPolicy only stripped orphan ToolResultsEvents at the head of the
window. A ToolResultsEvent whose matching ToolCallsEvent was trimmed but that
survived deeper in the window (mid-window orphan, or events carried over
between agents on shared streams) passed through and caused providers (OpenAI)
to reject the assembled messages with:
Invalid parameter: messages with role 'tool' must be a response
to a preceeding message with 'tool_calls'.
Call ensure_tool_pairing on the trimmed window so orphans at any position
are dropped. Add tests for mid-window orphans, multi-position orphans, and
the paired-and-kept case.
Fixes ag2ai#2793
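To illustrate why head-only stripping was not enough, here is a small sketch contrasting the two strategies on one window. `Call`, `Result`, and `Text` are hypothetical stand-ins for the real event classes, and both functions are simplified reconstructions from the description above, not the policy code.

```python
from dataclasses import dataclass


@dataclass
class Call:      # hypothetical stand-in for a model response carrying one tool call
    call_id: str


@dataclass
class Result:    # hypothetical stand-in for a tool-results event
    call_id: str


@dataclass
class Text:      # hypothetical stand-in for any other event
    content: str


def strip_head_only(window):
    """Old behaviour as described: strip tool results only at the head."""
    trimmed = list(window)
    while trimmed and isinstance(trimmed[0], Result):
        trimmed.pop(0)
    return trimmed


def strip_anywhere(window):
    """New behaviour as described: drop unmatched results at any position."""
    seen, kept = set(), []
    for ev in window:
        if isinstance(ev, Call):
            seen.add(ev.call_id)
        if isinstance(ev, Result) and ev.call_id not in seen:
            continue
        kept.append(ev)
    return kept


history = [Call("a"), Text("note"), Result("a"), Call("b"), Result("b")]
window = history[-4:]  # Call("a") falls outside the window, Result("a") does not

old = strip_head_only(window)
new = strip_anywhere(window)
print(Result("a") in old, Result("a") in new)
```

Because the window starts with a `Text` event, the head-only loop never runs and the orphan `Result("a")` stays in place, which is the mid-window case the commit message describes.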
TokenBudgetPolicy carried the same head-only orphan-stripping logic as SlidingWindowPolicy and shared the same bug: orphan ToolResultsEvents at non-leading positions in the budgeted window survived and triggered provider 400s. Reuse ensure_tool_pairing on the retained event list. Add a mid-window orphan test to mirror the sliding_window coverage. Refs ag2ai#2793
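The same failure mode can be sketched for a budget-based trim. Everything here is illustrative: the `Event` class, the whitespace-based `count_tokens`, and the newest-first trimming loop are assumptions made for the example, not the TokenBudgetPolicy implementation.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str            # "call", "result", or "text" (hypothetical tagging)
    call_id: str = ""
    text: str = ""


def count_tokens(event):
    # Crude whitespace count; a placeholder for a real tokenizer.
    return max(1, len(event.text.split()))


def trim_to_budget(events, budget):
    """Keep the newest events whose combined cost fits in the budget."""
    kept_reversed, used = [], 0
    for ev in reversed(events):
        cost = count_tokens(ev)
        if used + cost > budget:
            break
        kept_reversed.append(ev)
        used += cost
    return kept_reversed[::-1]


def drop_orphans(events):
    """Drop results whose call is not earlier in the retained list."""
    seen, kept = set(), []
    for ev in events:
        if ev.kind == "call":
            seen.add(ev.call_id)
        elif ev.kind == "result" and ev.call_id not in seen:
            continue
        kept.append(ev)
    return kept


events = [
    Event("call", call_id="a", text="lookup weather in Paris today please"),
    Event("text", text="interleaved note"),
    Event("result", call_id="a", text="sunny"),
    Event("text", text="thanks"),
]
budgeted = trim_to_budget(events, budget=4)
retained = drop_orphans(budgeted)
print([ev.kind for ev in budgeted], [ev.kind for ev in retained])
```

With a budget of 4, the expensive call is cut while its cheap result survives behind a text event, so only a full-list pairing pass removes it.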
Lancetnik (Member) approved these changes on May 11, 2026
@obchain thank you for the contribution! And special thanks for reading and following AI Policy
Codecov Report
✅ All modified and coverable lines are covered by tests.
... and 56 files with indirect coverage changes
Why are these changes needed?
SlidingWindowPolicy and TokenBudgetPolicy only stripped orphan ToolResultsEvents at the head of the trimmed event list. When the matching ToolCallsEvent was trimmed but the ToolResultsEvent survived at a non-leading position (for example a mid-window orphan, or events carried over between agents that share a stream), the assembled messages were sent to providers like OpenAI and rejected with:
Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.
Fix
New helper autogen/beta/policies/_pairing.py::ensure_tool_pairing scans the full trimmed list, collects surviving tool-call IDs from ModelResponse.tool_calls, and drops any ToolResultsEvent with no matching parent. Both SlidingWindowPolicy and TokenBudgetPolicy call it after their respective trimming step, replacing the prior leading-only while isinstance(trimmed[0], ToolResultsEvent) loop.
Commits
Split atomically:
- fix(policies): add tool-call pairing helper (helper module only).
- fix(sliding_window): drop orphan tool results (wire helper, add tests, Fixes #2793: SlidingWindowPolicy: incomplete tool call/result pairing causes OpenAI 400 errors).
- fix(token_budget): drop orphan tool results (same bug in sibling policy, wire helper, add test).
Tests
uv run pytest test/beta/policies/ -q → 18 passed (previously 14). New cases:
- sliding_window: mid-window orphan dropped; orphans at multiple positions dropped; paired call+result inside window preserved.
- token_budget: mid-window orphan dropped.
Existing head-orphan tests continue to pass: behavior is a strict superset of the old leading-only logic.
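The superset claim can be spot-checked exhaustively on small inputs. This is a sketch under assumptions, not part of the PR's test suite: events are modelled as plain tuples, and the pairing function uses a forward pass (a result needs an earlier matching call), which may differ from the real helper.

```python
from itertools import product


def head_only(events):
    """Old logic as described: strip results only from the head."""
    trimmed = list(events)
    while trimmed and trimmed[0][0] == "result":
        trimmed.pop(0)
    return trimmed


def pair_all(events):
    """New logic as sketched: drop any result without an earlier matching call."""
    seen, kept = set(), []
    for kind, call_id in events:
        if kind == "call":
            seen.add(call_id)
        elif kind == "result" and call_id not in seen:
            continue
        kept.append((kind, call_id))
    return kept


def is_subsequence(small, big):
    # Consumes `big` left to right, so order and duplicates are respected.
    it = iter(big)
    return all(item in it for item in small)


alphabet = [("call", "a"), ("result", "a"), ("result", "b"), ("text", "")]
checked = 0
for length in range(6):
    for combo in product(alphabet, repeat=length):
        events = list(combo)
        # Everything the old logic removed must also be gone under the new logic.
        assert is_subsequence(pair_all(events), head_only(events))
        checked += 1
print(checked)
```

Every event list of up to five items over this small alphabet passes, which is consistent with the new logic dropping at least everything the leading-only loop dropped.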
Related issue number
Fixes #2793