feat(asr): add MiMo (Xiaomi) ASR provider #103

Merged: missuo merged 2 commits into missuo:main from thedavidweng:feat/mimo-asr on Jun 11, 2026

@thedavidweng

Contributor

Summary

  • Add MiMo-V2.5-ASR provider support for Xiaomi's speech recognition API
  • Uses OpenAI-compatible chat completions endpoint with base64 audio input and SSE streaming
  • Follows the same architecture as the existing GLM provider (buffer audio → POST on finish → parse SSE stream)
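The buffer-then-POST lifecycle in the last bullet can be sketched roughly as below. `BufferingProvider` and its `String` error type are hypothetical stand-ins, not the actual types in koe-asr; only the method names `send_audio` and `finish_input` come from this PR. The network call is deliberately left out.

```rust
/// Hypothetical stand-in for the provider: collects PCM until input ends.
#[derive(Default)]
pub struct BufferingProvider {
    pcm: Vec<u8>,
    finished: bool,
}

impl BufferingProvider {
    /// Append a chunk of raw PCM bytes; nothing is sent over the network yet.
    pub fn send_audio(&mut self, chunk: &[u8]) -> Result<(), String> {
        if self.finished {
            return Err("input already finished".to_string());
        }
        self.pcm.extend_from_slice(chunk);
        Ok(())
    }

    /// Mark the input as complete and hand back the whole recording.
    /// A real provider would wrap this in WAV and POST it at this point.
    pub fn finish_input(&mut self) -> Result<Vec<u8>, String> {
        if self.finished {
            return Err("input already finished".to_string());
        }
        self.finished = true;
        Ok(std::mem::take(&mut self.pcm))
    }
}
```

Because the whole recording goes out in one request, nothing is transcribed until `finish_input()` is called.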

Changes

New file:

  • koe-asr/src/mimo.rs — MiMo ASR provider implementation with 19 unit tests (WAV wrapping, request body construction, response parsing, SSE line parsing, provider lifecycle)
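To illustrate the kind of SSE line parsing those tests cover, here is a minimal sketch using only the standard library. `SseLine` and `parse_sse_line` are hypothetical names, not taken from mimo.rs, and the `[DONE]` sentinel is assumed from the OpenAI-style streaming convention. Decoding the JSON payload is out of scope here.

```rust
/// What one line of an SSE stream means to the caller.
#[derive(Debug, PartialEq)]
pub enum SseLine<'a> {
    /// A `data:` line carrying a payload (JSON for chat completions).
    Data(&'a str),
    /// The `data: [DONE]` sentinel that ends an OpenAI-style stream.
    Done,
    /// Blank lines, comments, and fields this sketch does not use.
    Ignored,
}

/// Classify a single line of the response body.
pub fn parse_sse_line(line: &str) -> SseLine<'_> {
    // Drop any trailing CR/LF left over from splitting the stream.
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    match line.strip_prefix("data:") {
        Some(rest) => {
            // The SSE format allows one optional space after the colon.
            let payload = rest.strip_prefix(' ').unwrap_or(rest);
            if payload == "[DONE]" {
                SseLine::Done
            } else {
                SseLine::Data(payload)
            }
        }
        None => SseLine::Ignored,
    }
}
```

A caller would feed each `Data` payload to a JSON parser and stop reading on `Done`.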

Modified files:

  • koe-asr/src/lib.rs — register mimo module and re-export MimoAsrProvider
  • koe-core/src/config.rs — add MimoAsrConfig struct with url, api_key, model, language fields, defaults, and YAML config entry
  • koe-core/src/lib.rs — add match arm for "mimo" provider selection
  • KoeApp/.../SPSetupWizardWindowController.m — add MiMo to setup wizard (popup item, API key field with secure/plain toggle, show/hide logic, config save/load, test connection method, pane height)

How it works

  1. Audio PCM is buffered during send_audio()
  2. On finish_input(), PCM is wrapped in WAV, base64-encoded, and sent as data:audio/wav;base64,... in a chat completions request
  3. SSE streaming response is parsed for choices[0].delta.content (streaming) or choices[0].message.content (non-streaming)
  4. Authentication via api-key header
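Step 2 can be sketched with the standard library alone, as below. This assumes 16-bit little-endian PCM; the function names are hypothetical, the sample rate and channel count are caller-supplied because the PR does not state them, and base64 is hand-written only to keep the example dependency-free (the actual provider may well use a crate).

```rust
/// Wrap raw 16-bit little-endian PCM in a minimal 44-byte RIFF/WAVE header.
pub fn wrap_pcm_in_wav(pcm: &[u8], sample_rate: u32, channels: u16) -> Vec<u8> {
    let bits_per_sample: u16 = 16;
    let block_align = channels * (bits_per_sample / 8);
    let byte_rate = sample_rate * u32::from(block_align);
    // Sketch only: recordings larger than u32::MAX bytes are not handled.
    let data_len = pcm.len() as u32;

    let mut wav = Vec::with_capacity(44 + pcm.len());
    wav.extend_from_slice(b"RIFF");
    wav.extend_from_slice(&(36 + data_len).to_le_bytes()); // RIFF chunk size
    wav.extend_from_slice(b"WAVE");
    wav.extend_from_slice(b"fmt ");
    wav.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    wav.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    wav.extend_from_slice(&channels.to_le_bytes());
    wav.extend_from_slice(&sample_rate.to_le_bytes());
    wav.extend_from_slice(&byte_rate.to_le_bytes());
    wav.extend_from_slice(&block_align.to_le_bytes());
    wav.extend_from_slice(&bits_per_sample.to_le_bytes());
    wav.extend_from_slice(b"data");
    wav.extend_from_slice(&data_len.to_le_bytes());
    wav.extend_from_slice(pcm);
    wav
}

/// Standard base64 (RFC 4648, with padding).
pub fn base64_encode(input: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = chunk[0];
        let b1 = chunk.get(1).copied().unwrap_or(0);
        let b2 = chunk.get(2).copied().unwrap_or(0);
        out.push(TABLE[(b0 >> 2) as usize] as char);
        out.push(TABLE[(((b0 & 0x03) << 4) | (b1 >> 4)) as usize] as char);
        // Missing input bytes are encoded as '=' padding.
        if chunk.len() > 1 {
            out.push(TABLE[(((b1 & 0x0f) << 2) | (b2 >> 6)) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(TABLE[(b2 & 0x3f) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Build the `data:audio/wav;base64,...` string placed in the request.
pub fn wav_data_uri(pcm: &[u8], sample_rate: u32, channels: u16) -> String {
    let wav = wrap_pcm_in_wav(pcm, sample_rate, channels);
    format!("data:audio/wav;base64,{}", base64_encode(&wav))
}
```

For example, `wav_data_uri(&pcm, 16_000, 1)` would describe 16 kHz mono audio; those values are illustrative, not confirmed by the PR. Base64 inflates the payload by about a third, so long recordings produce large request bodies.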

Config example

asr:
  provider: "mimo"
  mimo:
    api_key: "your-key"
    model: "mimo-v2.5-asr"
    language: "auto"

Test plan

  • 82/82 Rust tests pass
  • cargo fmt --check clean
  • cargo clippy — no new warnings
  • Manual test: configure MiMo provider in setup wizard
  • Manual test: record audio and verify transcription
  • Manual test: test connection button with valid/invalid API key

thedavidweng and others added 2 commits June 10, 2026 14:23
Add support for Xiaomi MiMo-V2.5-ASR, which uses an OpenAI-compatible
chat completions endpoint with base64 audio input and SSE streaming.

- New provider: koe-asr/src/mimo.rs with 19 unit tests
- Config: MimoAsrConfig with url, api_key, model, language fields
- UI: Setup wizard entry with API key field, test connection button
- Default endpoint: https://api.xiaomimimo.com/v1/chat/completions

Audio and personal data go to Xiaomi's servers when using MiMo ASR.
Make that explicit in the setup wizard: an orange notice appears under
the API key row whenever the MiMo provider is selected, clarifying that
Koe itself collects nothing and runs no servers.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@missuo merged commit c049f7b into missuo:main on Jun 11, 2026
3 checks passed