Stars
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…
A neural full-band audio codec for general audio sampled at 48 kHz, at 7.5 kbps or 4.5 kbps.
A simulated operating system designed for AI agents to interact with the world
FinOps and cloud cost optimization tool. Supports AWS, Azure, GCP, Alibaba Cloud and Kubernetes.
Implementation of "SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection" paper
lokkelvin2 / dc_tts_GUI
Forked from Kyubyong/dc_tts
GUI Wrapper for 'A TensorFlow Implementation of DC-TTS: yet another text-to-speech model'
axelspringer / ForwardTacotron
Forked from fatchord/WaveRNN
⏩ Generating speech in a single forward pass without any attention!