Stars
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
Instant, controllable, local pre-trained AI models in Rust
Convolutional neural network model for video classification trained on the Kinetics dataset.
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance to start translating, H…
google / swift
Forked from swiftlang/swift
The Swift Programming Language
An implementation of the diffusers api in Rust
Google's NotebookLM, but local
Tutorial for Porting PyTorch Transformer Models to Candle (Rust)
Rust wrapper for Microsoft's ONNX Runtime (version 1.8)
J-Moshi: A Japanese Full-duplex Spoken Dialogue System
A fully in-browser privacy solution to make Conversational AI privacy-friendly
An LLM interface (chat bot) implemented in pure Rust using HuggingFace/Candle over Axum Websockets, an SQLite Database, and a Leptos (Wasm) frontend packaged with Tauri!
Simple high-throughput inference library