Skip to content

ISS-82: Add Pre-Built Message History To Runner#86

Merged
jrswab merged 2 commits into
masterfrom
ISS-82/pre-built-message-history
May 7, 2026
Merged

ISS-82: Add Pre-Built Message History To Runner#86
jrswab merged 2 commits into
masterfrom
ISS-82/pre-built-message-history

Conversation

@jrswab

@jrswab jrswab commented May 7, 2026

Copy link
Copy Markdown
Owner

Summary

This update adds pre-built message history support to the runner, enabling callers to provide a complete conversation sequence directly via Options.Messages instead of relying on stdin/prompt input. It includes comprehensive message validation, bidirectional conversion between internal and provider message formats, and updated dry-run/result output to reflect pre-built history. Additionally, test fixtures and golden files are updated to use the OpenRouter GPT-4o mini model as the default, and CI/CD is configured to provide the necessary API key.

Changelog

Added

  • Pre-built message history support through new Options.Messages field
  • Message types: Message, ToolCall, and ToolResult structs for representing conversation history
  • Message validation function (validateMessages) enforcing allowed roles and role-specific constraints
  • Bidirectional conversion functions between runner and provider message formats
  • Result messages output field for including conversation history in responses
  • Comprehensive test coverage for message validation, conversions, and pre-built history workflows including streaming, tool loops, and error cases

Changed

  • Run() now treats pre-built messages as first-class input, bypassing stdin/prompt resolution
  • Dry-run output updated to display pre-built message history details instead of computed user message
  • Memory append behavior changed to use the first user message content from pre-built history
  • Conversation loop updated to immediately append assistant messages with tool calls after LLM response
  • Test fixture model configurations across cmd/testdata/agents/ updated from openai/gpt-4o or anthropic/claude-sonnet-4-20250514 to openrouter/openai/gpt-4o-mini
  • Golden test output files updated to reflect new OpenRouter model identifier
  • GitHub Actions test job now sets OPENROUTER_API_KEY environment variable from repository secrets

- Add public Message, ToolCall, ToolResult types in pkg/runner
- Add validateMessages() with role, type-consistency, and call-ID checks
- Add toProviderMessages()/fromProviderMessage() conversion helpers
- Add Messages to runner.Options (skip Prompt/Stdin when set)
- Add Messages to runner.Result, populate from req.Messages after run
- Add MessageCount to DryRunInfo, update dry-run display
- Update MarshalJSON to include messages array with empty [] defaults
- Update memory append to use first user message from pre-built history
- Add firstUserMessageContent() helper for memory task text
- Fix conversation loop to append assistant messages before tool results
- Add comprehensive tests: single-shot, tool loop, streaming, validation,
  dry-run, memory append, empty slice fallback, JSON serialization
@coderabbitai

coderabbitai Bot commented May 7, 2026

Copy link
Copy Markdown

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aa55c302-3b54-45a5-8c4b-92d65585690a

📥 Commits

Reviewing files that changed from the base of the PR and between 8c2dd0d and 96ec8a4.

📒 Files selected for processing (17)
  • .github/workflows/go.yml
  • cmd/fixture_test.go
  • cmd/golden_test.go
  • cmd/smoke_test.go
  • cmd/testdata/agents/basic.toml
  • cmd/testdata/agents/with_files.toml
  • cmd/testdata/agents/with_memory.toml
  • cmd/testdata/agents/with_skill.toml
  • cmd/testdata/agents/with_subagents.toml
  • cmd/testdata/agents/with_tools.toml
  • cmd/testdata/golden/dry-run/basic.txt
  • cmd/testdata/golden/dry-run/with_files.txt
  • cmd/testdata/golden/dry-run/with_memory.txt
  • cmd/testdata/golden/dry-run/with_skill.txt
  • cmd/testdata/golden/dry-run/with_subagents.txt
  • cmd/testdata/golden/dry-run/with_tools.txt
  • cmd/testdata/golden/json/with_subagents.json
✅ Files skipped from review due to trivial changes (13)
  • cmd/testdata/golden/dry-run/with_skill.txt
  • cmd/testdata/golden/dry-run/with_subagents.txt
  • cmd/testdata/golden/dry-run/basic.txt
  • cmd/testdata/agents/with_tools.toml
  • cmd/testdata/golden/dry-run/with_files.txt
  • cmd/testdata/agents/with_subagents.toml
  • cmd/testdata/agents/basic.toml
  • cmd/testdata/agents/with_files.toml
  • cmd/testdata/agents/with_skill.toml
  • cmd/testdata/agents/with_memory.toml
  • cmd/testdata/golden/dry-run/with_memory.txt
  • cmd/testdata/golden/dry-run/with_tools.txt
  • cmd/testdata/golden/json/with_subagents.json

📝 Walkthrough

Walkthrough

This PR enables agents to accept pre-built message history via Options.Messages. It adds Message/ToolCall/ToolResult types, validation, converters to/from provider messages, run-loop integration (skipping stdin/prompt when provided), dry-run/result serialization for messages, and tests covering conversions, validation, run behavior, and fixtures.

Changes

Pre-built Message History

Layer / File(s) Summary
Message Types & Validation
pkg/runner/messages.go
Introduces Message, ToolCall, and ToolResult types with validateMessages enforcing role constraints and tool call/result associations.
Message Conversion
pkg/runner/messages.go
Bidirectional converters (toProviderMessages, fromProviderMessages) map between runner and provider message formats, including tool calls/results with argument deep-copy.
Input & Output Types
pkg/runner/options.go, pkg/runner/result.go
Adds Options.Messages for pre-built history input, Result.Messages for output, and DryRunInfo.MessageCount for summary.
Core Run Logic
pkg/runner/run.go
Validates pre-built messages early, conditionally skips stdin/prompt resolution, builds provider request from history, populates dry-run info with message count, appends assistant responses, and populates result messages. Memory append uses first user message content from history.
Result Serialization
pkg/runner/result.go
JSON marshalling for Result.Messages with serializeMessages helper guaranteeing non-null tool_calls and tool_results arrays.
Dry-run Display
cmd/run.go
printDryRun displays message history count instead of computed user message when MessageCount > 0.
Message Type Tests
pkg/runner/messages_test.go
Unit tests for message construction, type field access, conversion round-trips, and validation logic covering allowed roles and violations.
Result Serialization Tests
pkg/runner/result_test.go
Tests verify JSON marshalling of Result.Messages with correct field structure and presence of tool call/result objects.
Integration Tests
pkg/runner/run_test.go
End-to-end tests for streaming, single-shot, tool loops, validation errors, empty-slice fallback, dry-run, and memory append behavior with pre-built messages.
CI & Fixtures
.github/workflows/go.yml, cmd/*, cmd/testdata/*, cmd/testdata/golden/*
Workflow env and numerous fixture/golden/smoke-test updates to use OpenRouter/OpenAI-compatible model identifiers and OPENROUTER_API_KEY.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

💬 Messages now build upon the past,
No prompts needed—let conversations last.
Tool loops dance through pre-built history's thread,
While validation ensures no steps are misled. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 23.91% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately summarizes the main feature: adding support for pre-built message history to the runner component.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ISS-82/pre-built-message-history

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/runner/messages_test.go`:
- Around line 11-151: Refactor the individual tests in messages_test.go
(TestOptionsMessagesField, TestResultMessagesField, TestDryRunInfoMessageCount,
TestMessageTypeCanBeConstructed, TestToProviderMessages,
TestFromProviderMessage, TestRoundTripConversion and the remaining cases
mentioned) into table-driven tests using slices of structs with fields like
name, input, want (and any helper fields for expected ToolCalls/ToolResults),
then iterate the table with t.Run to assert expectations; keep the same
assertions (checking Options.Messages, Result.Messages, DryRunInfo.MessageCount,
Message fields, toProviderMessages/fromProviderMessage round-trips and
ToolCalls/ToolResults) but centralized in looped subtests to follow the
[]struct{name,input,want} pattern mandated by the testing guidelines.

In `@pkg/runner/messages.go`:
- Around line 11-15: Remove the package-level mutable map validRoles and replace
it with a side-effect-free helper predicate (e.g., an unexported
isValidRole(role string) bool) that uses a local switch or an immutable literal
lookup to validate roles; update all places that reference validRoles to call
isValidRole instead so you eliminate global mutable state while preserving the
same checks.

In `@pkg/runner/result_test.go`:
- Around line 63-164: The two one-off tests TestResultMarshalJSON_WithMessages
and TestResultMarshalJSON_WithMessagesAndToolCalls should be merged into a
single table-driven test (e.g., TestResultMarshalJSON) using a []struct{name
string, input Result, validate func(parsed map[string]interface{}) error} (or a
want struct describing expected message counts and fields) and iterating over
cases with t.Run(tc.name, func(t *testing.T){...}); locate the existing Result
instances and JSON marshal/unmarshal/assert logic in those functions and move
them into table entries, then run the same marshal/unmarshal flow and call the
per-case validate or compare fields (messages, tool_calls, tool_results, ids,
contents, is_error) to assert expected values.

In `@pkg/runner/run_test.go`:
- Around line 292-338: Refactor the standalone pre-built message tests
(including TestRun_PreBuiltMessages_Streaming and the other cases in the 589-957
range) into a table-driven test using a []struct{name string, opts Options,
mockResponses []testutil.MockLLMResponse, wantContent string, wantInputTokens
int, wantOutputTokens int, wantStdoutContains string} (or similar) and iterate
with for _, tt := range tests { t.Run(tt.name, func(t *testing.T) { ... }) };
inside each subtest set up the mock server using tt.mockResponses, call Run with
tt.opts, assert result fields against tt.want* and assert stdout contains
tt.wantStdoutContains when applicable (preserve the existing streaming test
behavior), and extract repeated setup code (setupTestEnv, newMockStdoutStderr,
env vars) into small helpers to keep each table entry concise.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7e400f4a-a89e-4965-b4a1-af785445f701

📥 Commits

Reviewing files that changed from the base of the PR and between 207449b and 8c2dd0d.

⛔ Files ignored due to path filters (2)
  • docs/plans/051_runner_message_history_implement.md is excluded by !docs/**
  • docs/plans/051_runner_message_history_spec.md is excluded by !docs/**
📒 Files selected for processing (8)
  • cmd/run.go
  • pkg/runner/messages.go
  • pkg/runner/messages_test.go
  • pkg/runner/options.go
  • pkg/runner/result.go
  • pkg/runner/result_test.go
  • pkg/runner/run.go
  • pkg/runner/run_test.go

Comment thread pkg/runner/messages_test.go
Comment thread pkg/runner/messages.go
Comment thread pkg/runner/result_test.go
Comment thread pkg/runner/run_test.go
Updated all test fixtures to use openrouter/openai/gpt-4o-mini model.
Updated smoke/golden/fixture tests to use OPENROUTER_API_KEY instead
of OPENAI_API_KEY / ANTHROPIC_API_KEY.
Added OPENROUTER_API_KEY env var to GitHub Actions test step.
Regenerated golden files to match the new model.
@jrswab jrswab merged commit f199404 into master May 7, 2026
13 of 14 checks passed
@jrswab jrswab deleted the ISS-82/pre-built-message-history branch May 7, 2026 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant