🎉 Today we are publicly releasing Chatterbox Multilingual V3. Alongside it, NVIDIA has published a NIM-optimized version in its build catalog.

Our last voice model, DramaBox, had watermarking enabled by default, and Multilingual V3 does the same. Voice generation without provenance pushes the cost of misuse onto everyone downstream, so we build it in from the start.

V3 is also built around what we kept hearing from many of you building with V2:
- my demo sounds great but production sounds wrong
- the dialect matters and the general model can't tell the difference
- I need this to work in Hindi or Portuguese, not just English

That's what V3 is for.

25 supported languages including 4 dialects and 6 tuned single-language models, the same 0.5B Llama backbone as V2, MIT licensed. Training mixture expanded from 25,635 to 36,692 hours, 275 ms time to first byte on a single H100, and 2x to 39x throughput on NIM over the unoptimized PyTorch baseline.

The Single Language Pack is a set of six dedicated models covering the most requested language improvements.

Chatterbox has over 13M downloads on Hugging Face. We're proud and grateful for how much this community builds with these models.

Thank you to the NVIDIA team that collaborated on the NIM launch: Adi Margolin, Maryam Motamedi, Mandar Padmawar and Rahul Mittal.

Multilingual V3:
Try it on NVIDIA NIM: https://lnkd.in/g__MFJg7
Try it on Hugging Face: https://lnkd.in/g7BTyQ6K

Single Language Packs:
Brazilian Portuguese Finetune: https://lnkd.in/ggtvPW_m
European Portuguese Finetune: https://lnkd.in/gUSYvKVy
Latin American Spanish Finetune: https://lnkd.in/gNHhUWwm
European Spanish Finetune: https://lnkd.in/gJVAKPgB
Mandarin Chinese Finetune: https://lnkd.in/gUKjSEmF
Hindi Finetune: https://lnkd.in/gruP-a4V