Offline STT & VAD Service (Rust, Tonic) (Q1:2025)
Updated Nov 1, 2025 - Rust
A TypeScript-based VAD that uses Silero VAD under the hood. It is highly robust for voice activity detection and works only in the browser.
🎤 Speech recognition experiments: Whisper, Wav2Vec, Silero ensemble pipelines
Rust implementation of the Silero Voice Activity Detection (VAD) model
BhashaBlend (AKA SynthoTranslate) is a video dubbing tool that addresses language barriers and hearing impairments in Indian education. It uses NLP, speech processing, and computer vision to provide multilingual content, sign language transcription, and voice cloning and translation for inclusive learning.
Convert PDFs/EPUBs to audiobooks with synchronized text highlighting using AI TTS models
Automatically cuts the parts without speech out of a given video, making it shorter and more enjoyable to watch (see the examples). Runs on Google Colab in a few clicks.
Refinement of an AI agent for a case at the Spring Economic School of Sber and HSE University. The solution took 3rd place.
Recognizes text on the screen and reads it aloud with a speech engine.
Self-hosted Text-to-Speech (TTS) server for OpenClaw
A Python project that compares the metrics of two pretrained AI models, WebRTC and Silero.
Modular Swift package for on-device voice activity detection on Apple platforms using CoreML and Silero VAD.
🔊 Clone voices easily with this lightweight Python app that synthesizes audio using a simple voice-cloning workflow.