Parse input_audio message content #47

Merged
tamnd merged 1 commit into main from audio-content on Jun 4, 2026
@tamnd (Owner) commented Jun 4, 2026

What

Audio reaches a chat message the same way images do: as a typed content part.
The OpenAI format carries it inline as base64 under an input_audio part with a
format field, rather than by URL.

  • InputAudio is added to ContentPart.
  • Message.AudioRefs decodes each clip's base64 into bytes paired with its
    format.
  • Message.HasAudio is the cheap check for whether a message carries audio.
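As a rough illustration of the three additions above, here is a minimal sketch. The names InputAudio, ContentPart, Message, AudioRefs and HasAudio come from this description; the struct fields, the JSON tags, the AudioRef result type, the skip rules and the error wording are assumptions made for the example and may not match the merged code.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// InputAudio holds an inline clip: base64 data plus a format tag such as "wav".
// Field names are assumed from the OpenAI input_audio shape.
type InputAudio struct {
	Data   string `json:"data"`
	Format string `json:"format"`
}

// ContentPart is one typed part of a message. The layout here is an assumption.
type ContentPart struct {
	Type       string      `json:"type"`
	Text       string      `json:"text,omitempty"`
	InputAudio *InputAudio `json:"input_audio,omitempty"`
}

// Message is reduced to the fields this sketch needs.
type Message struct {
	Role    string
	Content []ContentPart
}

// AudioRef is a hypothetical name for a decoded clip paired with its format.
type AudioRef struct {
	Data   []byte
	Format string
}

// hasClip reports whether a part carries non-empty inline audio.
func hasClip(p ContentPart) bool {
	return p.Type == "input_audio" && p.InputAudio != nil && p.InputAudio.Data != ""
}

// HasAudio is the cheap check: it looks for a clip without decoding anything.
func (m Message) HasAudio() bool {
	for _, p := range m.Content {
		if hasClip(p) {
			return true
		}
	}
	return false
}

// AudioRefs decodes every clip, skipping empty or missing audio and
// returning an error for a payload that is not valid base64.
func (m Message) AudioRefs() ([]AudioRef, error) {
	var refs []AudioRef
	for i, p := range m.Content {
		if !hasClip(p) {
			continue
		}
		raw, err := base64.StdEncoding.DecodeString(p.InputAudio.Data)
		if err != nil {
			return nil, fmt.Errorf("content part %d: decode input_audio: %w", i, err)
		}
		refs = append(refs, AudioRef{Data: raw, Format: p.InputAudio.Format})
	}
	return refs, nil
}

func main() {
	clip := base64.StdEncoding.EncodeToString([]byte("RIFF"))
	msg := Message{Role: "user", Content: []ContentPart{
		{Type: "text", Text: "transcribe this"},
		{Type: "input_audio", InputAudio: &InputAudio{Data: clip, Format: "wav"}},
	}}
	refs, err := msg.AudioRefs()
	fmt.Println(msg.HasAudio(), len(refs), err)
}
```

Keeping HasAudio free of decoding lets a caller decide whether a transcription stage is needed before paying for the base64 decode.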

Text flattening continues to ignore non-text parts, so audio rides along on the
message for a transcription stage to read instead of being dropped or leaking
into the prompt text.
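The flattening behaviour can be pictured with the sketch below. TextContent is the name used in this description; the simplified ContentPart, the AudioData field and the newline separator are assumptions made only for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// ContentPart is simplified for this sketch; AudioData is a hypothetical
// stand-in for the inline base64 payload.
type ContentPart struct {
	Type      string
	Text      string
	AudioData string
}

type Message struct {
	Content []ContentPart
}

// TextContent joins only the text parts. Audio (and any other non-text part)
// is left on the message untouched and never reaches the prompt text.
// Joining with a newline is an assumption, not the confirmed separator.
func (m Message) TextContent() string {
	var texts []string
	for _, p := range m.Content {
		if p.Type == "text" {
			texts = append(texts, p.Text)
		}
	}
	return strings.Join(texts, "\n")
}

func main() {
	msg := Message{Content: []ContentPart{
		{Type: "text", Text: "What is said in this clip?"},
		{Type: "input_audio", AudioData: "UklGRg=="},
	}}
	fmt.Printf("%q\n", msg.TextContent())
	// The audio part is still present for a later stage to read.
	fmt.Println(len(msg.Content))
}
```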

Scope

This change is confined to the API parsing layer, mirroring the earlier image
content change. It is pure Go and fully unit tested. The tests cover base64
decoding with the format preserved, rejection of a malformed payload, skipping
of empty or missing audio, the HasAudio check, and that TextContent still
ignores audio parts. Decoding the audio container and running speech-to-text
land later; this change makes the audio bytes available instead of discarding
them.

Test

  • go test ./... green.
  • go vet ./... clean.

Audio arrives in a chat message the same way images do: as a typed content
part. OpenAI carries it inline as base64 under an input_audio part with a format
field. Add the InputAudio type to ContentPart and AudioRefs on Message to decode
the clips out, alongside HasAudio for a cheap check.

Text flattening still ignores non-text parts, so audio rides along on the
message for a transcription stage to read rather than being dropped or leaking
into the prompt text. A payload that does not decode is reported as an error.
@tamnd merged commit 80b4d7a into main on Jun 4, 2026
1 check passed
@tamnd deleted the audio-content branch on June 4, 2026 at 10:16