chore: sync from agents-private#3176

Merged
inkeep-oss-sync[bot] merged 1 commit into main from copybara/sync on Apr 22, 2026

@inkeep-oss-sync

Automated sync from agents-private via Copybara mirror.

* wip: pre-push standardization scaffolding

* feat(ci): wire scoped pre-push runner into husky + update AGENTS.md

Completes the pre-push standardization started in the prior WIP commit.

- .husky/pre-push now runs `pnpm check:monorepo-traps` (whole-repo structural)
  then `pnpm check:pre-push` (scoped per-subtree). The scoped runner
  detects which subtrees changed versus `@{upstream}` and only runs fast
  CI-mirrored checks for them.
- AGENTS.md: new "Pre-push verification" and "Pre-commit verification"
  sections documenting the two-tier hook and the lint-staged routing.
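
The scoped detection described above can be sketched as a pure function over the changed-file list. This is only an illustration, not the actual contents of scripts/check-pre-push.mjs: the `SUBTREES` list and the function name are assumptions, and only `public/agents` is a path named in this PR.

```typescript
// Hypothetical subtree roots; only "public/agents" appears in this PR.
const SUBTREES: readonly string[] = ["public/agents", "private/app"];

// Map changed file paths (for example the output of
// `git diff --name-only @{upstream}`) to the subtrees whose scoped
// checks should run.
function changedSubtrees(
  files: string[],
  subtrees: readonly string[] = SUBTREES,
): string[] {
  const hit = new Set<string>();
  for (const file of files) {
    // Match on a path-segment boundary so "public/agents-docs" does not
    // count as a change under "public/agents".
    const match = subtrees.find(
      (root) => file === root || file.startsWith(root + "/"),
    );
    if (match !== undefined) hit.add(match);
  }
  // Keep declaration order so the run order is stable between pushes.
  return subtrees.filter((root) => hit.has(root));
}
```

Files outside every listed subtree map to nothing, so a docs-only change at the repo root would trigger only the whole-repo structural check.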

* docs: add format cheatsheet entries + pre-push runbook section

* fix(ci): tighten check-pre-push error handling and docstring accuracy

Address PR #202 review feedback on scripts/check-pre-push.mjs:

- Remove migration-lineage and knip from header docstring. Neither is in
  PUBLIC_AGENTS_STRUCTURAL_CHECKS and both are intentionally excluded
  (migration-lineage needs clean DB state; knip is ~10-30s and noisy).
  Documented the omissions so future readers don't wonder.
- runScript: surface spawn errors (result.error) and signal termination
  (result.signal) with their own branches before falling through to the
  generic exit-status message. New contributors with missing pnpm now
  see a real diagnostic instead of 'exit null'.
- getChangedFiles: same treatment. Distinguishes spawn failure, signal,
  and non-zero exit so the fallback warning in main() carries useful
  context.
- resolveBaseRef: warn when git spawn fails outright. The silent
  origin/main fallback still stands for the expected case (no upstream
  configured), but corrupted repo / permission issues now surface.
- Remediation hint at the end: only suggest 'pnpm --dir X format' for
  subtrees whose format:check actually failed. Structural-check
  failures (route-handler-patterns, dal-boundary, etc.) aren't fixed
  by format, and the old blanket suggestion was misleading.
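
A minimal sketch of the branch ordering described for runScript and getChangedFiles, assuming the result object has the same `error`, `signal`, and `status` fields that Node's `child_process.spawnSync` returns. The `SpawnOutcome` interface, the function name, and the message wording are hypothetical.

```typescript
// Structural subset of what child_process.spawnSync returns.
interface SpawnOutcome {
  error?: Error;
  signal: string | null;
  status: number | null;
}

// Returns a diagnostic for a failed run, or null when the script succeeded.
// Spawn errors and signals are checked first: in both cases `status` is
// null, which is what produced the unhelpful "exit null" message.
function describeFailure(name: string, result: SpawnOutcome): string | null {
  if (result.error) {
    return `${name}: failed to start (${result.error.message})`;
  }
  if (result.signal) {
    return `${name}: terminated by signal ${result.signal}`;
  }
  if (result.status !== 0) {
    return `${name}: exited with status ${result.status}`;
  }
  return null;
}
```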

* fix(ci): declare lint-staged as root devDependency

Address PR #202 review feedback: lint-staged was listed only in
public/agents/package.json and not hoisted to root under pnpm's
isolated node_modules layout. 'pnpm lint-staged' at repo root then
failed with ERR_PNPM_RECURSIVE_EXEC_FIRST_FAIL, which meant the
pre-commit hook was a silent no-op.

Declaring lint-staged as a root devDependency at the same ^16.1.5
range as public/agents keeps the pre-commit hook functional from
the monorepo root. Only the root pnpm-lock.yaml changes; the
public/agents lockfile is untouched.

Verified: 'pnpm exec lint-staged --version' now resolves at root.

* feat(agents): Support more inline text attachment formats (#196)

* Support more inline text attachment formats

* Reduce allowlist test worker churn

* Update OpenAPI snapshot for text attachments

* Cleanup

* Add .cfg text document support

* Capture intermediate text in structured-output generation (#178)

* [US-001] Extend generationType literal union to include mixed_generation

Adds 'mixed_generation' to the AgentGenerateData.generationType union so
downstream consumers can record session events for agent turns that
produce both text and data parts. Includes a new test covering the
mixed_generation value.

Foundation for subsequent stories that compute and emit the new
discriminant from the post-stream resolution site.
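
To make the three-value union concrete, here is a sketch of how the discriminant could be derived from a turn's parts. The `Part` shape and the `resolveGenerationType` name are assumptions for illustration, and treating an empty turn as `text_generation` is a guess rather than documented behavior.

```typescript
type GenerationType =
  | "text_generation"
  | "object_generation"
  | "mixed_generation";

// Hypothetical part shape; the real parts carry more fields.
interface Part {
  kind: "text" | "data";
}

function resolveGenerationType(parts: Part[]): GenerationType {
  const hasText = parts.some((part) => part.kind === "text");
  const hasData = parts.some((part) => part.kind === "data");
  if (hasText && hasData) return "mixed_generation";
  // Assumption: a turn with no data parts (including an empty one)
  // is recorded as plain text.
  return hasData ? "object_generation" : "text_generation";
}
```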

* [US-002] Add write-queue serialization to IncrementalStreamParser

Serialize processTextChunk and processObjectDelta via an internal
promise-chain writeQueue so that concurrent fullStream and
partialOutputStream consumers cannot corrupt shared parser state
(collectedParts, buffer, pendingTextBuffer, hasStartedRole,
componentAccumulator, allStreamedContent, streamHelper).

External method signatures are unchanged. Method bodies are moved to
_doProcessTextChunk and _doProcessObjectDelta; the public methods chain
work onto writeQueue and catch rejections so a throwing write does not
break subsequent enqueued writes.
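
The promise-chain pattern can be sketched in isolation as below. This is not the IncrementalStreamParser code: `SerializedWriter`, its `log` field, and the `write` signature are hypothetical stand-ins that show only the chaining and the rejection handling.

```typescript
class SerializedWriter {
  // Tail of the chain; every new write is appended behind it.
  private writeQueue: Promise<void> = Promise.resolve();
  readonly log: string[] = [];

  write(label: string, work: () => Promise<void>): Promise<void> {
    this.writeQueue = this.writeQueue
      .then(work)
      // Swallow the rejection after recording it, so one throwing write
      // does not leave the chain rejected for later writes.
      .catch(() => {
        this.log.push(`error:${label}`);
      });
    return this.writeQueue;
  }
}
```

Because each call reassigns `writeQueue` synchronously, callers on two different streams still run their bodies one at a time, in the order they called `write`.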

* [US-003] Consume fullStream + partialOutputStream concurrently for structured output

When hasStructuredOutput is true, handleStreamGeneration now runs
processStreamEvents against fullStream alongside the partialOutputStream
consumer via Promise.all. This ensures intermediate-step text-deltas
(e.g. 'Let me search...' emitted before a tool call) reach the parser
even when the final structured object fails to materialize, fixing the
blank-screen failure mode documented in SPEC.md.

Adds integration tests in __tests__/stream-handler.test.ts covering:
- both streams consumed concurrently when hasStructuredOutput is true
- fullStream text captured when partialOutputStream is empty
- falsy partial-output deltas filtered
- tool-call/tool-result/finish events forward to markToolResult
- error events surface via throw
- non-structured path does not iterate partialOutputStream
- tee-delivery ordering preserved with interleaved events
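
A simplified sketch of the concurrent consumption described above, assuming both sources are async iterables. The event shape, the `sink` object, and the function names are illustrative; the real handler forwards to the parser and to markToolResult rather than to arrays.

```typescript
// Helper that turns an array into an async stream, for demonstration only.
async function* fromArray<T>(items: T[]): AsyncGenerator<T> {
  for (const item of items) {
    await Promise.resolve();
    yield item;
  }
}

async function consumeBoth(
  fullStream: AsyncIterable<{ type: string; text?: string }>,
  partialOutputStream: AsyncIterable<unknown>,
  sink: { text: string[]; objects: unknown[] },
): Promise<void> {
  const events = (async () => {
    for await (const event of fullStream) {
      if (event.type === "error") throw new Error("stream error");
      if (event.type === "text-delta" && event.text) sink.text.push(event.text);
    }
  })();
  const partials = (async () => {
    for await (const delta of partialOutputStream) {
      if (delta) sink.objects.push(delta); // falsy deltas are filtered out
    }
  })();
  // Both loops are already running; Promise.all only waits for them and
  // rejects as soon as either one throws.
  await Promise.all([events, partials]);
}
```

With this shape, text from fullStream still reaches the sink when partialOutputStream yields nothing, which is the case the blank-screen fix targets.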

* [US-004] Extend post-stream fallback + mixed_generation discriminant + WARN log

* [US-006] Docs: document mixed_generation in agent_generate reference

Extend data-operations.mdx to cover the three-value generationType union
(text_generation, object_generation, mixed_generation) with a table and a
note on parts[] ordering for mixed responses. No changeset — agents-docs
is not published via the release-group flow.

* chore: add changeset for agents-api minor bump

Captures: dual-stream consumption, post-stream fallback, mixed_generation
generationType, and structured-output failure WARN log.

* chore: downgrade changeset from minor to patch

Ships as a bug fix — blank-screen failures in structured-output
agents. The mixed_generation generationType value is additive; no
breaking change. Patch is the correct semver classification.

* docs: describe mixed and text-fallback response shapes for dataComponents

data-components.mdx now documents the three response shapes a dataComponents
agent can return (object_generation, text_generation, mixed_generation) and
the text-fallback behavior when the model fails to produce a valid structured
object. status-updates.mdx event-type list updated to reflect that
agent_generate also covers mixed text plus structured output.

* fixup! local-review: address findings (pass 1)

* fixup! local-review: address findings (pass 2)

* chore: remove spec + research artifacts from branch

These are local ship workflow artifacts — the SPEC.md and research
report are worktree-only inputs, not part of the PR deliverable.

* style: auto-format with biome

* chore: address PR feedback on dual-stream + unknown part kind

Document the implicit AI SDK coupling between fullStream and
partialOutputStream that makes AbortController cancellation safe, and
warn when mapPartsToEventParts hits an unknown part kind instead of
silently producing an empty text part.

* fix: skip structured-output JSON text-deltas to prevent duplicate text

During structured-output generation, fullStream emits text-delta events
whose payload is the raw JSON encoding the schema. partialOutputStream
concurrently emits parsed object deltas for the same content. Feeding
both into the parser produced interleaved/duplicated text inside Text
component props (visible in the final dataComponents output).

Classify each step by its first non-whitespace text-delta character: if
'{' or '[', the step is emitting structured-output JSON — skip its
text-deltas and let partialOutputStream drive the parsed components.
Otherwise (free-form reasoning like "Let me search..." before a tool
call), forward text-deltas so intermediate text still reaches the
parser live. Reset classification on finish so multi-step flows are
judged per step.
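
The first-character classification can be sketched as a small pure function. The name `classifyStepMode` matches a helper mentioned later in this PR, but the body and the `"undecided"` state below are assumptions, not the shipped implementation.

```typescript
type StepMode = "json" | "text" | "undecided";

// Classify a step from a text-delta. Whitespace-only deltas stay
// "undecided" so the caller can wait for the next delta.
function classifyStepMode(delta: string): StepMode {
  const first = delta.trim().charAt(0);
  if (first === "") return "undecided";
  return first === "{" || first === "[" ? "json" : "text";
}
```

A caller would keep the first decided mode for the rest of the step and clear it when the step finishes.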

* fix: parse fullStream JSON ourselves so multi-step structured output streams

In multi-step structured-output generation (e.g., agent emits text, calls
tool, then emits final structured JSON), partialOutputStream appears to
stop emitting after step 1 completes — its internal JSON accumulator
can't cleanly extend across the step boundary. Combined with the previous
skip-JSON-text-deltas guard, step 2's content never reached the parser
and the wire went dark after the tool call.

Replace the skip with active parsing: when a step's first text-delta
indicates JSON (starts with { or [), accumulate subsequent text-deltas
into a per-step buffer, run parsePartialJson on each update, and feed
the repaired cumulative object to parser.processObjectDelta — the same
entry point partialOutputStream uses. _doProcessObjectDelta's
length-based diffing and lastStreamedComponents tracking dedupe
naturally when both sources deliver the same snapshot, so the parallel
partialOutputStream consumer stays in place as a cooperating source.

The buffer resets on 'finish' so multi-step JSON doesn't concatenate
across steps (which would break parsePartialJson). Free-form reasoning
text (non-JSON) still flows through processTextChunk unchanged.

* fix: smoother structured-output text streaming + reset buffer per step

Three fixes layered on the multi-step structured-output streaming flow:

1. Reset the fullStream JSON buffer on 'finish-step', not just 'finish'.
   AI SDK v6 emits 'finish-step' between steps and 'finish' only at the
   very end. Without this, step 2's JSON text-deltas append onto step 1's
   closed JSON, parsePartialJson can't recover, and step 2 never streams.

2. Skip Text dataComponents in IncrementalStreamParser's
   "component-no-longer-in-accumulator" cleanup loop. When step 2 replaces
   the dataComponents array via deepMerge, step 1's Text id disappears
   from the current set and the loop flushes it through
   streamComponent → writeData('data-component', ...). But Text components
   were already streamed as text-delta wire events, so that flush produced
   a spurious data-component chunk duplicating the earlier text.

3. Drop the 50ms delay in IncrementalStreamParser's streamText calls.
   VercelDataStreamHelper.streamText sleeps delayMs between text-start
   and text-delta. The writeQueue serializes calls, so during the sleep
   more fullStream text-deltas accumulate in the JSON buffer and the next
   parsePartialJson produces a large diff. Result: 50ms becomes the
   effective chunk granularity (whole paragraphs per wire event). With
   delayMs=0, diffs flow at the rate parsePartialJson produces them,
   which is per-fullStream-text-delta granularity — much smoother.
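
Fix 1 can be illustrated with a per-step buffer that resets on both 'finish-step' and 'finish'. This is a sketch under assumptions: the event shapes are simplified, and `tryParse` is a strict `JSON.parse` stand-in for parsePartialJson, so unlike the real parser it emits only once a step's JSON is complete.

```typescript
type StreamEvent =
  | { type: "text-delta"; text: string }
  | { type: "finish-step" }
  | { type: "finish" };

// Stand-in parser: returns undefined when the buffer is not valid JSON yet.
function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

function collectStepObjects(
  events: StreamEvent[],
  parse: (text: string) => unknown = tryParse,
): unknown[] {
  const emitted: unknown[] = [];
  let buffer = "";
  for (const event of events) {
    if (event.type === "text-delta") {
      buffer += event.text;
      const value = parse(buffer);
      if (value !== undefined) emitted.push(value);
    } else {
      // Reset between steps. Without this, step 2's JSON would be appended
      // to step 1's closed JSON and never parse.
      buffer = "";
    }
  }
  return emitted;
}
```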

* fix: insert \n\n between consecutive Text dataComponents

When a dataComponents array contains multiple Text components back-to-back
(with or without other components like citations interspersed), their
props.text gets streamed as text-delta wire events with no separator.
In Markdown rendering, adjacent paragraphs without a blank line between
them collapse into a single run-on block.

Emit '\n\n' before the first streamed chunk of any Text component whose
id differs from the previously streamed Text id. The separator fires only
on the initial streaming of a new Text id, so incremental updates to the
same Text id (typewriter streaming) still flow without breaks.
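
One way to express the separator rule is a small tracker keyed on the last streamed Text id. The class and method names are hypothetical, and the sketch assumes the very first Text component gets no separator, which the commit message does not state explicitly.

```typescript
class TextSeparator {
  private lastTextId: string | null = null;

  // Returns the prefix to emit before a chunk of the Text component `id`.
  prefixFor(id: string): string {
    const startsNewText = this.lastTextId !== null && this.lastTextId !== id;
    this.lastTextId = id;
    return startsNewText ? "\n\n" : "";
  }
}
```

Repeated chunks for the same id return an empty prefix, so typewriter-style updates are not broken up.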

* refactor: extract streaming helpers and document the structured-output pipeline

Consolidates the fixes that enable token-level streaming of Anthropic
structured output, and extracts inline logic into well-named helpers so
the "what" reads as a sequence and the "why" lives in docblocks.

stream-handler.ts:
 - Docblocks on handleStreamGeneration (dual source rationale, abort
   wiring) and processStreamEvents (per-step classification, JSON buffer
   reset on finish-step).
 - Extract classifyStepMode, accumulateAndEmitJsonDelta,
   consumePartialOutputStream, normalizeStreamError.

IncrementalStreamParser.ts:
 - Class-level docblock tracing the full pipeline: jsonTool provider
   option, fullStream + partialOutputStream consumption, Text vs
   data-component wire mapping, paragraph separator, positional
   non-Text gating.
 - Module-level isTextComponent type guard — one place for the
   Text/non-Text routing decision.
 - Extract flushEvictedComponents, streamTextComponentDelta,
   emitTextToClient, rememberSnapshot from _doProcessObjectDelta. The
   method body now reads as two clearly-labelled steps (flush evicted,
   then walk current array) instead of two nested loops.

stream-helpers.ts:
 - Tighten the VercelDataStreamHelper.streamText comment to explain
   granularity (not pacing) is what makes rendering feel smooth.

generate.ts:
 - Tighten the providerOptions.anthropic.structuredOutputMode comment to
   cite the specific models affected (Sonnet 4.5, Opus 4.5, Opus 4.1)
   and the external context (vercel/ai#9195, the provider source file).

* style: auto-format with biome

* fix: make preludeEqualsOutput key-order independent

Comparing JSON.stringify(parsed) === JSON.stringify(output) is
order-dependent — two objects with identical content but different key
order would compare unequal and produce a duplicated prelude in the
rendered structured-output response. Canonicalise with a recursive
key-sort before stringifying so the comparison is purely structural.
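
A minimal sketch of the recursive key-sort, assuming JSON-compatible values. The helper names are illustrative and are not taken from the preludeEqualsOutput implementation.

```typescript
// Rebuild a value with object keys in sorted order at every depth.
// Array order is preserved because it is part of the content.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => canonicalize(item));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

function structurallyEqual(a: unknown, b: unknown): boolean {
  return JSON.stringify(canonicalize(a)) === JSON.stringify(canonicalize(b));
}
```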

* fix: enable jsonTool mode for Anthropic structured output streaming

Set providerOptions.anthropic.structuredOutputMode = 'jsonTool' on
streamText calls that use Output.object(). Without this, Claude Sonnet
4.5 / Opus 4.5 / Opus 4.1 buffer the final structured JSON server-side
and return it as a single giant text-delta event after 20+ seconds of
silence, because the default path routes through Anthropic's native
structured-outputs beta and Vercel AI SDK's createOutputTransformStream
gates publishing on parsePartialJson producing a new valid partial
(which it can't for deeply nested schemas until the tail).

jsonTool forces the synthetic-tool fallback path which streams tokens
as input_json_delta events and bypasses the transform gate entirely.
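
As a configuration fragment, the option described above would look roughly like this. Only the `anthropic.structuredOutputMode` key and the `'jsonTool'` value come from this commit; the constant name is hypothetical, and the object is meant to be passed as `providerOptions` on the streamText call.

```typescript
// Hypothetical constant; pass it as `providerOptions` to streamText when
// the call uses Output.object().
const structuredOutputProviderOptions = {
  anthropic: {
    structuredOutputMode: "jsonTool",
  },
} as const;
```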

Known tradeoff, accepted: tool_choice: required (auto-set in jsonTool
mode) prefills the assistant turn, so Claude does NOT emit pre-tool-call
reasoning text. This is documented Anthropic API behaviour, not an SDK
bug. Existing data-operation events (tool_call, tool_result) still
surface tool activity to the UI, so users see the agent is working.

Verified neither @ai-sdk/anthropic upgrade (3.0.7 → 3.0.71) nor ai
upgrade (6.0.14 → 6.0.168) resolves the buffering — the transform-gate
has not been fixed upstream as of this commit. Community tracking:
vercel/ai#3422, #12427, #12298, #7220, #9351.

See the extensive comment in generate.ts for full rationale, the
alternatives considered, and references to the relevant source lines
and Anthropic docs.

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Version Packages (agents) (#205)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(ci): encode repository_dispatch client_payload as JSON object

v0.70.0 was stranded because both the success and failure notify steps in
inkeep/agents' release.yml sent client_payload as stringified JSON via
gh api --raw-field. The dispatches endpoint rejects that with HTTP 422
"is not an object". npm had already published, but the reverse-sync was
lost, so there was no GitHub Release, no Vercel prod deploy, and no
tracking issue.

Switch both steps to build the request body with jq and pipe through
gh api --input -, which sends client_payload as a real JSON object.

Runbook gains a new entry documenting the 422 symptom, the gh api flag
encoding pitfall, and the manual -F 'client_payload[key]=val' recovery
command for any future stranding.
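
The encoding difference is easier to see side by side. The workflow itself uses jq and gh api; the TypeScript below only illustrates the two request bodies, and the function names, event type, and payload keys are hypothetical.

```typescript
// Correct: client_payload is a nested JSON object.
function buildDispatchBody(
  eventType: string,
  payload: Record<string, string>,
): string {
  return JSON.stringify({ event_type: eventType, client_payload: payload });
}

// Incorrect: client_payload is a string that happens to contain JSON.
// This is the shape the dispatches endpoint rejected with HTTP 422.
function buildStringifiedBody(
  eventType: string,
  payload: Record<string, string>,
): string {
  return JSON.stringify({
    event_type: eventType,
    client_payload: JSON.stringify(payload),
  });
}
```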

---------

Co-authored-by: mike-inkeep <mike.r@inkeep.com>
Co-authored-by: tim-inkeep <132074086+tim-inkeep@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: inkeep-internal-ci[bot] <259778081+inkeep-internal-ci[bot]@users.noreply.github.com>
GitOrigin-RevId: 6061f7757827aca72ac7dffd87a0fe07ea68b352

inkeep-oss-sync bot enabled auto-merge on April 22, 2026 at 19:32
@changeset-bot

changeset-bot Bot commented Apr 22, 2026


⚠️ No Changeset found

Latest commit: 8ace2eb

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@inkeep-internal-ci inkeep-internal-ci Bot left a comment


Automated approval from agents-private public-mirror-sync (run: https://github.com/inkeep/agents-private/actions/runs/24798411055). Source of truth is the monorepo; direct edits on inkeep/agents are overwritten on next sync.

@inkeep-oss-sync inkeep-oss-sync Bot added this pull request to the merge queue Apr 22, 2026
Merged via the queue into main with commit baff971 Apr 22, 2026
14 checks passed
@inkeep-oss-sync inkeep-oss-sync Bot deleted the copybara/sync branch April 22, 2026 19:39