Stars
Python client for the Gradium Voice AI API.
Google's NotebookLM, but local
Instant, controllable, local pre-trained AI models in Rust
A collection of optimisers for use with candle
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Simple and efficient time representation in Rust.
Fine-tuning Moshi/J-Moshi on your own spoken dialogue data
Python bindings for symphonia/opus: read various audio formats from Python and write Opus files
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
A Fish Speech implementation in Rust, with Candle.rs
J-Moshi: A Japanese Full-duplex Spoken Dialogue System
Simple high-throughput inference library