🎤 Enable seamless audio processing with "sherpa-onnx," supporting speech recognition, synthesis, and more across multiple platforms.
Updated Nov 11, 2025 - C++
A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI framework, it provides an efficient, scalable, and user-friendly solution for running MLX-based vision and language models locally with an OpenAI-compatible interface.
🎤 Transcribe audio and video files into text or subtitles effortlessly on Google Colab using OpenAI Whisper, with no installation needed.
🔊 Train audio models efficiently with MiMo-Audio-Training, a toolkit designed for straightforward implementation and enhanced performance in audio processing tasks.
🎥 Translate and generate subtitles for any video in real time, enhancing your viewing experience across multiple platforms with privacy-focused processing.
🎙️ Record and transcribe audio effortlessly using AI technologies for clear and accurate text output with this simple web application.
🎙️ Transcribe podcast episodes quickly by pasting links and API keys, generating human-readable transcripts with timestamps for easy editing.
🎤 Control your world with Jarvis, a voice-activated AI assistant that simplifies tasks and enhances productivity.
📚 Transform learning with TatvaX, an AI platform providing personalized education in 8 Indian languages, breaking down language barriers for millions.
🗣️ Elevate your workflow with NeuraVoice, an AI desktop assistant that combines speech recognition and local LLM responses for seamless task automation.
🎤 Transform spoken phrases into OWL ontologies, making it easy to create structured data from voice. Ideal for developers and researchers alike.
🔢 Extract numbers swiftly from JSON, YAML, CSV, TOML, INI, and ENV files at 1.5M numbers per second, 100x faster than manual searching.
🤖 Learn motion imitation with MimicKit, a framework offering advanced methods to train motion controllers using state-of-the-art algorithms and techniques.
📖 Explore the Quran with an AI-powered Next.js app, offering semantic search, tafsir integration, and enhanced study features for deeper understanding.
AI-powered tools and chatbots that personalize German learning and automate feedback.
🌟 Simplify language translation and communication with echo-zxx, a powerful tool for seamless text conversion across multiple languages.
🎤 Enhance your voice-to-text transcriptions with WhisperClip, prioritizing privacy and featuring AI improvements for macOS users.
🎉 Kickstart your Web3 journey by showcasing your project from the Women Web3 Wave #5 Demoday. Join us to drive change and innovation together.
🤖 Run state-of-the-art Machine Learning models in Dart with transformers_dart—cross-platform, serverless, and based on Hugging Face's transformers.
🤖 Explore deep learning architectures like ANN, CNN, RNN, and LSTM to enhance your understanding of machine learning and neural networks.