- Amazon
- London, UK
- https://alessiofalai.it
Stars
Open-source text-to-speech model from KRAFTON trained exclusively on public speech data, with curated datasets and reproducible training support.
Comprehensive benchmark suite comparing pitch detection algorithms across multiple datasets.
A real-time and multilingual speech translation model
An autonomous agent for deep financial research
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Our first fully AI-generated deep learning system
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
A TTS that fits in your CPU (and pocket)
A highly compressive and high-quality neural audio codec for speech models.
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
LLM Council works together to answer your hardest questions
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
MiMo-Audio: Audio Language Models are Few-Shot Learners
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Our library for RL environments + evals
Corpus of resources for multimodal machine learning with physiological signals (mmps).
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
Voice Activity Detector (VAD): low-latency, high-performance and lightweight
A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.
Magenta RealTime: An Open-Weights Live Music Model
🔥 Clone and recreate any website as a modern React app in seconds