qwen3-speech

qwen3-speech is a library-first speech runtime for Qwen3-TTS and Qwen3-ASR. The core product is the embeddable library and stable C ABI; CLI tools, tests, and future engine bindings are adapters over that surface.

Current Status

As of April 3, 2026, the repository has the intended library structure, public headers, C ABI, CLI adapters, and stage-level fallback implementations for TTS, ASR, audio tokenization, and streaming event surfaces.

ASR has a complete real GGUF inference path under src/asr/ref_impl/ that loads GGUF weights and executes GGML computation graphs. When the manifest points to a valid GGUF artifact, real inference runs; otherwise the pipeline falls back to deterministic stubs for testing.
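The real-versus-fallback decision described above can be sketched as follows. This is a minimal illustration, not the project's implementation: every `example_` name is hypothetical, and the only validity check shown is the 4-byte `GGUF` magic at the start of the file. An actual loader would need to validate much more (version, tensor metadata, and so on).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for illustration; not part of the qwen3-speech API. */
typedef enum {
    EXAMPLE_PATH_FALLBACK_STUB = 0, /* deterministic stub, used for testing */
    EXAMPLE_PATH_REAL_GGUF = 1      /* real GGUF-backed inference */
} example_inference_path;

/* Returns 1 if the file at `path` starts with the 4-byte magic "GGUF",
 * and 0 if the file is missing, unreadable, too short, or has other bytes. */
static int example_file_has_gguf_magic(const char *path) {
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return 0;
    }
    char magic[4];
    size_t n = fread(magic, 1, sizeof magic, f);
    fclose(f);
    return n == sizeof magic && memcmp(magic, "GGUF", 4) == 0;
}

/* Picks the real path only when the artifact looks like a GGUF file;
 * anything else falls back to the deterministic stub. */
static example_inference_path example_select_path(const char *artifact_path) {
    if (artifact_path != NULL && example_file_has_gguf_magic(artifact_path)) {
        return EXAMPLE_PATH_REAL_GGUF;
    }
    return EXAMPLE_PATH_FALLBACK_STUB;
}
```

Falling back rather than failing keeps tests runnable on machines that have no converted model artifacts, which matches the testing role the stubs play here.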

TTS does not yet have a real in-tree GGUF inference path: transformer generation, vocoder decode, speaker encoding, and audio tokenization are all deterministic fallback stubs. Real GGUF TTS inference has been validated against the upstream reference project, which confirms that the integration path is feasible.

What Exists Today

Manifest-first model loading and capability flags in include/qwen3_speech/.
Stage-oriented TTS modules under src/tts/.
Stage-oriented ASR modules under src/asr/.
Real GGUF inference for ASR under src/asr/ref_impl/ (GGUF loading, audio encoder, text decoder, streaming).
Runtime backend selection and GGML backend registration under src/runtime/.
Stable C ABI entry points under src/c_api/.
Thin CLI adapters under cli/.
Unit and integration tests for manifests, common helpers, mel spectrograms, fallback stages, C API smoke, and TTS→ASR roundtrip data flow.
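As an illustration of how capability flags are commonly consumed from C, the sketch below assumes they are packed into a bitmask. That representation and every `EXAMPLE_` name are assumptions made for this example; the actual flag definitions are in include/qwen3_speech/ and may look different.

```c
#include <stdint.h>

/* Hypothetical capability bits; the real names and values may differ. */
#define EXAMPLE_CAP_TTS       (UINT32_C(1) << 0)
#define EXAMPLE_CAP_ASR       (UINT32_C(1) << 1)
#define EXAMPLE_CAP_STREAMING (UINT32_C(1) << 2)

/* Returns 1 only if every bit set in `required` is also set in `caps`. */
static int example_has_caps(uint32_t caps, uint32_t required) {
    return (caps & required) == required;
}
```

A caller could test `example_has_caps(caps, EXAMPLE_CAP_ASR | EXAMPLE_CAP_STREAMING)` before requesting streaming transcription, and report an unsupported-capability error otherwise.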

What Does Not Exist Yet

In-tree real GGUF tensor loading and execution for Qwen3-TTS (ASR is implemented).
Checked-in model manifests or bundled converted model artifacts.
Full reference, integration, ABI, and performance suites described by the long-term architecture.
Godot binding implementation beyond the directory scaffold.

Build

cmake -S . -B build -DQWEN3_SPEECH_BUILD_TESTS=ON -DQWEN3_SPEECH_BUILD_CLI=ON
cmake --build build
ctest --test-dir build --output-on-failure

The top-level build vendors ggml/ from the repo by default. Backend and feature toggles are exposed as CMake options; see docs/build-and-test.md.

Repo Layout

include/qwen3_speech/   Public C and thin C++ wrapper headers
src/common/             Shared runtime utilities
src/runtime/            Backend registry, selection, and backend adapters
src/tts/                TTS stage modules and orchestration
src/asr/                ASR stage modules, orchestration, and real GGUF inference
src/asr/ref_impl/       Real GGUF inference path for ASR
src/c_api/              Stable C ABI implementation
cli/                    CLI adapters over the library
tests/                  Unit, integration, and smoke tests
docs/                   Project documentation

Documentation

docs/architecture.md: runtime structure, object lifetimes, and stage boundaries
docs/build-and-test.md: build flags, test targets, and CLI usage
docs/model-manifests.md: manifest schema, capabilities, and examples
docs/real-inference-status.md: validated GGUF workflow and current integration gap

Real Inference Boundary

ASR has real in-tree GGUF inference. TTS still runs on deterministic fallback stubs: the API shape and stage boundaries are in place, but no GGUF loader or graph execution exists for TTS yet. See docs/real-inference-status.md for the full status, validation history, and reproduction steps.
