speech-ai
Here are 19 public repositories matching this topic...
Production-ready examples for Brainiall Speech AI APIs — Pronunciation Assessment, STT, TTS. Python, JavaScript, curl, and MCP configs.
Updated Mar 10, 2026 - Python
End-to-end speaker diarization and transcription pipeline using Whisper, VAD, and clustering in Python.
Updated Feb 7, 2026 - Jupyter Notebook
Open source AI voice calling agent for Twilio phone calls, built with FastAPI, Google ADK, and Gemini Live API
Updated Mar 24, 2026 - Python
A voice-based AI chat interface built with Next.js and ElevenLabs. Start and stop real-time conversations with an animated UI that reflects agent status. Fully responsive and deployable via Vercel with environment-based agent configuration.
Updated Jul 14, 2025 - TypeScript
Enterprise-Grade Secure ASR Diarization Pipeline - HIPAA-compliant speech processing service combining automatic speech recognition with speaker diarization. Features modular architecture, comprehensive security, and production-ready deployment.
Updated Mar 23, 2026 - Python
Voice agent prototype for structured clinical interviewing, with VAD-based interruption handling, modular ASR/LLM/TTS backends, and dialogue workflow control.
Updated Mar 14, 2026 - Python
Provides Whisper-based audio transcription and translation through lightweight C++ libraries for easy integration into LLM projects.
Updated Mar 31, 2026 - C++
Open-source real-time Voice AI infrastructure in Go. Stream audio via WebRTC or WebSocket, connect STT → LLM → TTS pipelines, and build scalable voice agents and conversational AI applications.
Updated Mar 20, 2026 - Go
MCP Server for Brainiall Speech AI - pronunciation assessment, speech-to-text, and text-to-speech
Updated Mar 11, 2026 - Python
🇺🇦 Ukrainian RAD-TTS++ models (decoder + models with 3 voices) and HiFiGAN model
Updated Feb 27, 2025
A voice-based AI chat interface built with Next.js and ElevenLabs. Start and stop real-time conversations with an animated UI that reflects agent status. Fully responsive and deployable via Vercel with environment-based agent configuration.
Updated Jun 24, 2025 - TypeScript
🇺🇦 Open Source Ukrainian Text-to-Speech datasets
Updated Feb 24, 2025 - Python
A unified benchmarking framework for evaluating Voice AI agents across conversational quality, audio realism, latency metrics, and safety guardrails with scalable multi-language stress testing.
Updated Feb 26, 2026 - Python
Just a simple multimodal avatar interaction platform
Updated Mar 11, 2026 - JavaScript
A Docker-based OpenAI-compatible Text-to-Speech API server powered by Kyutai's TTS models with GPU acceleration support.
Updated Jul 12, 2025 - Python
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Updated Oct 16, 2025
Rapida is an open-source, end-to-end voice AI orchestration platform for building real-time conversational voice agents with audio streaming, STT, TTS, VAD, multi-channel integration, agent state management, and observability.
Updated Mar 31, 2026 - Go