Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
Updated Oct 19, 2024 - HTML
Rhasspy voice assistant for offline home automation
Real-time transcription using faster-whisper
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems for speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many other tasks.
An end-to-end speech recognition neural network implemented in Keras. This was my final project for the Artificial Intelligence Nanodegree @udacity.
PyTorch implementation of subband decomposition
ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).
whisper-cpp-serve: real-time speech recognition using a C/C++ implementation of OpenAI's Whisper model
StageMate is the smart assistant for your presentation. It will cover all aspects of your pitch from skipping slides to reminding you if you miss some major point.
Built a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline.
VietGPT VoiceBot: a chatbot that automatically recognizes Vietnamese speech and uses the ChatGPT API for natural language interaction.
Python platform for working with LLMs
Speakify is a web application that uses Edge TTS to convert text to speech in a variety of voices.
A MATLAB implementation of CHiME4 baseline Beamformit
Webpage for maintaining the list of openly available DL, ML, RL, Vision, NLP, and Optimization courses
A mobile web application that helps you convert spoken words to sharable/editable text 🎊
This app lets users convert their speech into text and send that text as a message. It records audio blobs in real time! Every 10 seconds, the recorded blob is sent to the server, where it is converted into text and sent as a message to the other user.
API and web UI for converting audio and video in Eastern languages to subtitles, based on the Dolphin model
Open Source Wearable Microphone Array Glasses for Multi-Speaker Speech Recognition