chore: sync from agents-private by inkeep-oss-sync[bot] · Pull Request #3176 · inkeep/agents

inkeep-oss-sync · 2026-04-22T19:32:10Z

Automated sync from agents-private via Copybara mirror.

* wip: pre-push standardization scaffolding

* feat(ci): wire scoped pre-push runner into husky + update AGENTS.md

  Completes the pre-push standardization started in the prior WIP commit.

  - .husky/pre-push now runs `pnpm check:monorepo-traps` (whole-repo structural) then `pnpm check:pre-push` (scoped per-subtree). The scoped runner detects which subtrees changed versus `@{upstream}` and only runs fast CI-mirrored checks for them.
  - AGENTS.md: new "Pre-push verification" and "Pre-commit verification" sections documenting the two-tier hook and the lint-staged routing.

* docs: add format cheatsheet entries + pre-push runbook section

* fix(ci): tighten check-pre-push error handling and docstring accuracy

  Address PR #202 review feedback on scripts/check-pre-push.mjs:

  - Remove migration-lineage and knip from header docstring. Neither is in PUBLIC_AGENTS_STRUCTURAL_CHECKS and both are intentionally excluded (migration-lineage needs clean DB state; knip is ~10-30s and noisy). Documented the omissions so future readers don't wonder.
  - runScript: surface spawn errors (result.error) and signal termination (result.signal) with their own branches before falling through to the generic exit-status message. New contributors with missing pnpm now see a real diagnostic instead of 'exit null'.
  - getChangedFiles: same treatment. Distinguishes spawn failure, signal, and non-zero exit so the fallback warning in main() carries useful context.
  - resolveBaseRef: warn when git spawn fails outright. The silent origin/main fallback still stands for the expected case (no upstream configured), but corrupted repo / permission issues now surface.
  - Remediation hint at the end: only suggest 'pnpm --dir X format' for subtrees whose format:check actually failed. Structural-check failures (route-handler-patterns, dal-boundary, etc.) aren't fixed by format, and the old blanket suggestion was misleading.
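The three-branch diagnostic described for runScript and getChangedFiles can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/check-pre-push.mjs: the `SpawnOutcome` interface is a hypothetical subset of what Node's `child_process.spawnSync` returns, and `describeFailure` and its message wording are assumed names invented for this example.

```typescript
// Hypothetical subset of the object returned by Node's child_process.spawnSync.
interface SpawnOutcome {
  error?: Error; // set when the process could not be spawned (e.g. binary missing)
  signal: string | null; // set when the process was terminated by a signal
  status: number | null; // exit code, or null if the process never exited normally
}

// Returns null on success, otherwise a diagnostic naming the specific failure.
// The order matters: a spawn error or a signal both leave status === null,
// so checking status first would only ever report "exit null".
function describeFailure(label: string, result: SpawnOutcome): string | null {
  if (result.error) {
    return `${label}: failed to spawn (${result.error.message})`;
  }
  if (result.signal) {
    return `${label}: terminated by signal ${result.signal}`;
  }
  if (result.status !== 0) {
    return `${label}: exited with status ${result.status}`;
  }
  return null;
}
```

Keeping the classification in a pure function like this makes each branch testable without actually spawning a process.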
* fix(ci): declare lint-staged as root devDependency

  Address PR #202 review feedback: lint-staged was listed only in public/agents/package.json and not hoisted to root under pnpm's isolated node_modules layout. 'pnpm lint-staged' at repo root then failed with ERR_PNPM_RECURSIVE_EXEC_FIRST_FAIL, which meant the pre-commit hook was a silent no-op.

  Declaring lint-staged as a root devDependency at the same ^16.1.5 range as public/agents keeps the pre-commit hook functional from the monorepo root. Only the root pnpm-lock.yaml changes; the public/agents lockfile is untouched.

  Verified: 'pnpm exec lint-staged --version' now resolves at root.

* feat(agents): Support more inline text attachment formats (#196)

* Support more inline text attachment formats
* Reduce allowlist test worker churn
* Update OpenAPI snapshot for text attachments
* Cleanup
* Add .cfg text document support

* Capture intermediate text in structured-output generation (#178)

* [US-001] Extend generationType literal union to include mixed_generation

  Adds 'mixed_generation' to the AgentGenerateData.generationType union so downstream consumers can record session events for agent turns that produce both text and data parts. Includes a new test covering the mixed_generation value. Foundation for subsequent stories that compute and emit the new discriminant from the post-stream resolution site.

* [US-002] Add write-queue serialization to IncrementalStreamParser

  Serialize processTextChunk and processObjectDelta via an internal promise-chain writeQueue so that concurrent fullStream and partialOutputStream consumers cannot corrupt shared parser state (collectedParts, buffer, pendingTextBuffer, hasStartedRole, componentAccumulator, allStreamedContent, streamHelper). External method signatures are unchanged. Method bodies are moved to _doProcessTextChunk and _doProcessObjectDelta; the public methods chain work onto writeQueue and catch rejections so a throwing write does not break subsequent enqueued writes.
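The promise-chain write queue from [US-002] can be illustrated with a small stand-alone class. This is a sketch under stated assumptions, not the IncrementalStreamParser implementation: `SerializedWriter`, `write`, `doWrite`, `log`, and `errors` are hypothetical names, and recording caught errors in an array is one possible way to "catch rejections" that the commit message does not specify.

```typescript
// Hypothetical stand-in for a parser whose writes must not interleave.
class SerializedWriter {
  private writeQueue: Promise<void> = Promise.resolve();
  readonly log: string[] = [];
  readonly errors: unknown[] = [];

  // Public entry point: chain the work onto the queue so each write starts
  // only after the previous one has settled. The catch keeps the chain
  // resolved, so a throwing write cannot block writes enqueued after it.
  write(chunk: string, delayMs = 0): Promise<void> {
    this.writeQueue = this.writeQueue
      .then(() => this.doWrite(chunk, delayMs))
      .catch((err: unknown) => {
        this.errors.push(err);
      });
    return this.writeQueue;
  }

  // The actual work; the await simulates a write that yields mid-operation.
  private async doWrite(chunk: string, delayMs: number): Promise<void> {
    this.log.push(`start:${chunk}`);
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    if (chunk === "bad") throw new Error(`write failed: ${chunk}`);
    this.log.push(`end:${chunk}`);
  }
}
```

Two concurrent callers can both call `write` without awaiting each other; the log still shows every write finishing (or failing) before the next one starts.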
* [US-003] Consume fullStream + partialOutputStream concurrently for structured output

  When hasStructuredOutput is true, handleStreamGeneration now runs processStreamEvents against fullStream alongside the partialOutputStream consumer via Promise.all. This ensures intermediate-step text-deltas (e.g. 'Let me search...' emitted before a tool call) reach the parser even when the final structured object fails to materialize, fixing the blank-screen failure mode documented in SPEC.md.

  Adds integration tests in __tests__/stream-handler.test.ts covering:

  - both streams consumed concurrently when hasStructuredOutput is true
  - fullStream text captured when partialOutputStream is empty
  - falsy partial-output deltas filtered
  - tool-call/tool-result/finish events forward to markToolResult
  - error events surface via throw
  - non-structured path does not iterate partialOutputStream
  - tee-delivery ordering preserved with interleaved events

* [US-004] Extend post-stream fallback + mixed_generation discriminant + WARN log

* [US-006] Docs: document mixed_generation in agent_generate reference

  Extend data-operations.mdx to cover the three-value generationType union (text_generation, object_generation, mixed_generation) with a table and a note on parts[] ordering for mixed responses. No changeset: agents-docs is not published via the release-group flow.

* chore: add changeset for agents-api minor bump

  Captures: dual-stream consumption, post-stream fallback, mixed_generation generationType, and structured-output failure WARN log.

* chore: downgrade changeset from minor to patch

  Ships as a bug fix (blank-screen failures in structured-output agents). The mixed_generation generationType value is additive; no breaking change. Patch is the correct semver classification.
* docs: describe mixed and text-fallback response shapes for dataComponents

  data-components.mdx now documents the three response shapes a dataComponents agent can return (object_generation, text_generation, mixed_generation) and the text-fallback behavior when the model fails to produce a valid structured object.

  status-updates.mdx event-type list updated to reflect that agent_generate also covers mixed text plus structured output.

* fixup! local-review: address findings (pass 1)
* fixup! local-review: address findings (pass 2)

* chore: remove spec + research artifacts from branch

  These are local ship workflow artifacts: the SPEC.md and research report are worktree-only inputs, not part of the PR deliverable.

* style: auto-format with biome

* chore: address PR feedback on dual-stream + unknown part kind

  Document the implicit AI SDK coupling between fullStream and partialOutputStream that makes AbortController cancellation safe, and warn when mapPartsToEventParts hits an unknown part kind instead of silently producing an empty text part.

* fix: skip structured-output JSON text-deltas to prevent duplicate text

  During structured-output generation, fullStream emits text-delta events whose payload is the raw JSON encoding the schema. partialOutputStream concurrently emits parsed object deltas for the same content. Feeding both into the parser produced interleaved/duplicated text inside Text component props (visible in the final dataComponents output).

  Classify each step by its first non-whitespace text-delta character: if '{' or '[', the step is emitting structured-output JSON, so skip its text-deltas and let partialOutputStream drive the parsed components. Otherwise (free-form reasoning like "Let me search..." before a tool call), forward text-deltas so intermediate text still reaches the parser live. Reset classification on finish so multi-step flows are judged per step.
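The per-step classification described in the last commit above can be sketched as a small state machine. This is an illustration only: `StepMode`, `classifyStep`, and `StepClassifier` are hypothetical names, and how the real stream handler treats whitespace that arrives before the step is classified is not stated in this PR (the sketch simply withholds it).

```typescript
type StepMode = "json" | "text" | "undecided";

// The first non-whitespace character decides the mode;
// whitespace-only input stays undecided.
function classifyStep(textSoFar: string): StepMode {
  const first = textSoFar.trimStart()[0];
  if (first === undefined) return "undecided";
  return first === "{" || first === "[" ? "json" : "text";
}

// Hypothetical per-step tracker fed with each text-delta of a step.
class StepClassifier {
  private mode: StepMode = "undecided";
  private seen = "";

  // Returns true when the delta belongs to free-form text and should be
  // forwarded; false for JSON steps and for deltas seen while undecided.
  shouldForward(delta: string): boolean {
    if (this.mode === "undecided") {
      this.seen += delta;
      this.mode = classifyStep(this.seen);
    }
    return this.mode === "text";
  }

  // Call at a step boundary so the next step is judged on its own.
  reset(): void {
    this.mode = "undecided";
    this.seen = "";
  }
}
```

Once a step is classified the decision sticks until `reset()`, so a '{' appearing in the middle of free-form reasoning text does not flip the step into JSON mode.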
* fix: parse fullStream JSON ourselves so multi-step structured output streams

  In multi-step structured-output generation (e.g., agent emits text, calls tool, then emits final structured JSON), partialOutputStream appears to stop emitting after step 1 completes; its internal JSON accumulator can't cleanly extend across the step boundary. Combined with the previous skip-JSON-text-deltas guard, step 2's content never reached the parser and the wire went dark after the tool call.

  Replace the skip with active parsing: when a step's first text-delta indicates JSON (starts with { or [), accumulate subsequent text-deltas into a per-step buffer, run parsePartialJson on each update, and feed the repaired cumulative object to parser.processObjectDelta, the same entry point partialOutputStream uses. _doProcessObjectDelta's length-based diffing and lastStreamedComponents tracking dedupe naturally when both sources deliver the same snapshot, so the parallel partialOutputStream consumer stays in place as a cooperating source.

  The buffer resets on 'finish' so multi-step JSON doesn't concatenate across steps (which would break parsePartialJson). Free-form reasoning text (non-JSON) still flows through processTextChunk unchanged.

* fix: smoother structured-output text streaming + reset buffer per step

  Three fixes layered on the multi-step structured-output streaming flow:

  1. Reset the fullStream JSON buffer on 'finish-step', not just 'finish'. AI SDK v6 emits 'finish-step' between steps and 'finish' only at the very end. Without this, step 2's JSON text-deltas append onto step 1's closed JSON, parsePartialJson can't recover, and step 2 never streams.
  2. Skip Text dataComponents in IncrementalStreamParser's "component-no-longer-in-accumulator" cleanup loop. When step 2 replaces the dataComponents array via deepMerge, step 1's Text id disappears from the current set and the loop flushes it through streamComponent → writeData('data-component', ...). But Text components were already streamed as text-delta wire events, so that flush produced a spurious data-component chunk duplicating the earlier text.
  3. Drop the 50ms delay in IncrementalStreamParser's streamText calls. VercelDataStreamHelper.streamText sleeps delayMs between text-start and text-delta. The writeQueue serializes calls, so during the sleep more fullStream text-deltas accumulate in the JSON buffer and the next parsePartialJson produces a large diff. Result: 50ms becomes the effective chunk granularity (whole paragraphs per wire event). With delayMs=0, diffs flow at the rate parsePartialJson produces them, which is per-fullStream-text-delta granularity and much smoother.

* fix: insert \n\n between consecutive Text dataComponents

  When a dataComponents array contains multiple Text components back-to-back (with or without other components like citations interspersed), their props.text gets streamed as text-delta wire events with no separator. In Markdown rendering, adjacent paragraphs without a blank line between them collapse into a single run-on block.

  Emit '\n\n' before the first streamed chunk of any Text component whose id differs from the previously streamed Text id. The separator fires only on the initial streaming of a new Text id, so incremental updates to the same Text id (typewriter streaming) still flow without breaks.

* refactor: extract streaming helpers and document the structured-output pipeline

  Consolidates the fixes that enable token-level streaming of Anthropic structured output, and extracts inline logic into well-named helpers so the "what" reads as a sequence and the "why" lives in docblocks.

  stream-handler.ts:
  - Docblocks on handleStreamGeneration (dual source rationale, abort wiring) and processStreamEvents (per-step classification, JSON buffer reset on finish-step).
  - Extract classifyStepMode, accumulateAndEmitJsonDelta, consumePartialOutputStream, normalizeStreamError.

  IncrementalStreamParser.ts:
  - Class-level docblock tracing the full pipeline: jsonTool provider option, fullStream + partialOutputStream consumption, Text vs data-component wire mapping, paragraph separator, positional non-Text gating.
  - Module-level isTextComponent type guard: one place for the Text/non-Text routing decision.
  - Extract flushEvictedComponents, streamTextComponentDelta, emitTextToClient, rememberSnapshot from _doProcessObjectDelta. The method body now reads as two clearly-labelled steps (flush evicted, then walk current array) instead of two nested loops.

  stream-helpers.ts:
  - Tighten the VercelDataStreamHelper.streamText comment to explain granularity (not pacing) is what makes rendering feel smooth.

  generate.ts:
  - Tighten the providerOptions.anthropic.structuredOutputMode comment to cite the specific models affected (Sonnet 4.5, Opus 4.5, Opus 4.1) and the external context (vercel/ai#9195, the provider source file).

* style: auto-format with biome

* fix: make preludeEqualsOutput key-order independent

  Comparing JSON.stringify(parsed) === JSON.stringify(output) is order-dependent: two objects with identical content but different key order would compare unequal and produce a duplicated prelude in the rendered structured-output response. Canonicalise with a recursive key-sort before stringifying so the comparison is purely structural.

* fix: enable jsonTool mode for Anthropic structured output streaming

  Set providerOptions.anthropic.structuredOutputMode = 'jsonTool' on streamText calls that use Output.object(). Without this, Claude Sonnet 4.5 / Opus 4.5 / Opus 4.1 buffer the final structured JSON server-side and return it as a single giant text-delta event after 20+ seconds of silence, because the default path routes through Anthropic's native structured-outputs beta and Vercel AI SDK's createOutputTransformStream gates publishing on parsePartialJson producing a new valid partial (which it can't for deeply nested schemas until the tail).

  jsonTool forces the synthetic-tool fallback path which streams tokens as input_json_delta events and bypasses the transform gate entirely.

  Known tradeoff, accepted: tool_choice: required (auto-set in jsonTool mode) prefills the assistant turn, so Claude does NOT emit pre-tool-call reasoning text. This is documented Anthropic API behaviour, not an SDK bug. Existing data-operation events (tool_call, tool_result) still surface tool activity to the UI, so users see the agent is working.

  Verified neither @ai-sdk/anthropic upgrade (3.0.7 → 3.0.71) nor ai upgrade (6.0.14 → 6.0.168) resolves the buffering; the transform-gate has not been fixed upstream as of this commit. Community tracking: vercel/ai#3422, #12427, #12298, #7220, #9351.

  See the extensive comment in generate.ts for full rationale, the alternatives considered, and references to the relevant source lines and Anthropic docs.

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Version Packages (agents) (#205)

  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(ci): encode repository_dispatch client_payload as JSON object

  v0.70.0 stranded because both the success and failure notify steps in inkeep/agents' release.yml sent client_payload as a stringified JSON via gh api --raw-field. The dispatches endpoint rejects that with HTTP 422 "is not an object", npm had already published, but the reverse-sync was lost so no GitHub Release, no Vercel prod deploy, no tracking issue.

  Switch both steps to build the request body with jq and pipe through gh api --input -, which sends client_payload as a real JSON object. Runbook gains a new entry documenting the 422 symptom, the gh api flag encoding pitfall, and the manual -F 'client_payload[key]=val' recovery command for any future stranding.
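The difference between a stringified and a real-object client_payload in the repository_dispatch fix above is easy to see in the serialized request body. The sketch below is illustrative only: the release workflow uses jq and gh api rather than TypeScript, and the event type and payload keys shown are hypothetical, since the real ones do not appear in this PR.

```typescript
// Shape of a repository_dispatch request body: client_payload must be an object.
interface DispatchBody {
  event_type: string;
  client_payload: Record<string, unknown>;
}

// Correct encoding: the payload is nested as an object and serialized once.
function buildDispatchBody(eventType: string, payload: Record<string, unknown>): string {
  const body: DispatchBody = { event_type: eventType, client_payload: payload };
  return JSON.stringify(body);
}

// The failure mode: the payload is stringified first, so after the outer
// serialization it arrives at the API as a JSON string, not an object.
function buildDoubleEncodedBody(eventType: string, payload: Record<string, unknown>): string {
  return JSON.stringify({ event_type: eventType, client_payload: JSON.stringify(payload) });
}

// Hypothetical event type and payload keys, for illustration only.
const payload = { version: "0.70.0", status: "success" };
const good = JSON.parse(buildDispatchBody("release-notify", payload));
const bad = JSON.parse(buildDoubleEncodedBody("release-notify", payload));
console.log(typeof good.client_payload, typeof bad.client_payload);
```

Checking `typeof client_payload` on the parsed body is a cheap guard that could run before the request is sent.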
---------

Co-authored-by: mike-inkeep <mike.r@inkeep.com>
Co-authored-by: tim-inkeep <132074086+tim-inkeep@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: inkeep-internal-ci[bot] <259778081+inkeep-internal-ci[bot]@users.noreply.github.com>

GitOrigin-RevId: 6061f7757827aca72ac7dffd87a0fe07ea68b352
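The key-order-independent comparison described in the preludeEqualsOutput fix in the commit log above can be sketched as a recursive key-sort followed by a string comparison. This is a minimal version assuming JSON-compatible values; `canonicalize` and `structurallyEqual` are hypothetical names, not the helpers in the repository.

```typescript
// Rebuild a JSON-compatible value with object keys inserted in sorted order.
// Array element order is preserved because it is significant in JSON.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

// Structural equality: both sides are canonicalized before stringifying,
// so differing key insertion order no longer affects the result.
function structurallyEqual(a: unknown, b: unknown): boolean {
  return JSON.stringify(canonicalize(a)) === JSON.stringify(canonicalize(b));
}
```

For example, `{ a: 1, b: { c: 2, d: 3 } }` and `{ b: { d: 3, c: 2 }, a: 1 }` stringify differently as written but compare equal once canonicalized.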

changeset-bot · 2026-04-22T19:32:14Z

⚠️ No Changeset found

Latest commit: 8ace2eb

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

inkeep-internal-ci

Automated approval from agents-private public-mirror-sync (run: https://github.com/inkeep/agents-private/actions/runs/24798411055). Source of truth is the monorepo; direct edits on inkeep/agents are overwritten on next sync.

inkeep-oss-sync Bot enabled auto-merge April 22, 2026 19:32

inkeep-internal-ci Bot approved these changes Apr 22, 2026

inkeep-oss-sync Bot added this pull request to the merge queue Apr 22, 2026

Merged via the queue into main with commit baff971 Apr 22, 2026
14 checks passed

inkeep-oss-sync Bot deleted the copybara/sync branch April 22, 2026 19:39

inkeep-oss-sync[bot] merged 1 commit into main from copybara/sync
