v0.4.3

@Blaizzy

What's Changed

Add LongCat-AudioDiT 1B TTS model by @Blaizzy in #627
feat: add WebM audio format support by @regcs in #635
Add MkDocs docs site and docs guardrails by @shreyaskarnik in #626
Update branch for GitHub Actions workflow by @Blaizzy in #639
feat: add MeloTTS-English MLX port by @shreyaskarnik in #629
feat: add OmniVoice zero-shot multilingual TTS (646+ languages) by @beshkenadze in #630
Register client disconnects while streaming TTS audio by @orbitalquark in #634
fix(kokoro): support quantized checkpoint layout and guard NaN durations by @beshkenadze in #624
Remove docs check for user-facing changes by @Blaizzy in #658
fix(stt): correct granite_speech Conv1d weight sanitization and add parakeet model_type by @ryancee in #657
fix(cohere): restore quantized inference for 8-bit and 4-bit checkpoints by @beshkenadze in #650
feat(irodori-tts): add v2 model support with VoiceDesign and chunked DACVAE decode by @yoshphys in #660
feat: add Higgs Audio v2 — 3B Llama-backed TTS with voice cloning by @Kairos-a in #656
Remove librosa dependency by @lucasnewman in #662
Replace all soundfile calls with core equivalents by @lucasnewman in #663
Move misaki to an optional install to reduce dependency graph by @lucasnewman in #664
Improve performance of Parakeet TDT on longform content by @lucasnewman in #665
Fix Voxtral Realtime streaming and speed up the 4-bit path by ~3x by @iris-sfg in #661
feat(higgs_audio): add ReferenceContext for reusable encoded-reference state by @Kairos-a in #666
Fix Voxtral TTS tokenizer dependency contract by @lyonsno in #633
Remove pyloudnorm dependency by @lucasnewman in #667
Support concurrent requests to the server by @lucasnewman in #668
Add a standard model loading path for STS models by @lucasnewman in #670
Remove pydub dependency by @lucasnewman in #671
Clean up bare scipy usage by @lucasnewman in #672
Remove explicit tiktoken dependency by @lucasnewman in #673
docs: add Svara TTS (multilingual Indic) entry by @shreyaskarnik in #678
Fix Voxtral STT crash on eos_token_ids initialization by @contrapuntal in #677
feat: add Mel-Band-RoFormer architecture for vocal source separation by @xocialize in #654
Improved dep handling for mlx-lm by @lucasnewman in #683
Add MOSS-TTS-Nano by @lucasnewman in #676
docs: add shields.io badges and table of contents to README by @Gingiris in #680
Adjust Trendshift badge to README by @Blaizzy in #684
Add batching support for Fish Speech S2 Pro by @lucasnewman in #675
Add continuous batching support for Qwen3 TTS to the server by @lucasnewman in #674

New Contributors

@regcs made their first contribution in #635
@ryancee made their first contribution in #657
@Kairos-a made their first contribution in #656
@iris-sfg made their first contribution in #661
@lyonsno made their first contribution in #633
@contrapuntal made their first contribution in #677
@xocialize made their first contribution in #654
@Gingiris made their first contribution in #680

Full Changelog: v0.4.2...v0.4.3
