Highlights
- Pro
Stars
MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenario…
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models
Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in Pytorch.
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
🚀🚀 「大模型」2小时完全从0训练64M的小参数GPT!🌏 Train a 64M-parameter GPT from scratch in just 2h!
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷