🤖 AI Chatbot with Voice Interface - A Flask web app featuring Groq-powered chat, voice input/output, and theme support. Combines natural language processing with speech synthesis for an interactive chat experience. #Python #Flask #AI #VoiceInterface
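A minimal sketch of how such a Flask chat endpoint might call Groq. The route path, model id, and the GROQ_API_KEY environment variable are assumptions for illustration, not details taken from this project.

```python
# Minimal Flask + Groq chat endpoint (illustrative sketch, not this project's code).
# Assumes the groq SDK is installed and GROQ_API_KEY is set in the environment.
import os
from flask import Flask, request, jsonify
from groq import Groq

app = Flask(__name__)
client = Groq(api_key=os.environ["GROQ_API_KEY"])

@app.route("/chat", methods=["POST"])
def chat():
    user_message = request.json.get("message", "")
    # The model name is an assumption; any Groq-hosted chat model would work here.
    completion = client.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=[
            {"role": "system", "content": "You are a helpful voice assistant."},
            {"role": "user", "content": user_message},
        ],
    )
    return jsonify({"reply": completion.choices[0].message.content})

if __name__ == "__main__":
    app.run(debug=True)
```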
This project implements a speech emotion classification system using neural networks and genetic algorithms for optimization. The system classifies emotions such as calm, happy, sad, angry, fearful, surprised, and disgusted from speech audio using the RAVDESS dataset.
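A rough sketch of the kind of pipeline such a system uses: mean MFCC features per clip plus a small neural classifier. The dataset layout, feature choice, and classifier are assumptions, and the project's genetic-algorithm optimization (for example, evolving hyperparameters) is only hinted at in comments.

```python
# Illustrative sketch of speech emotion classification on RAVDESS-style files.
# Paths, features, and the classifier are assumptions, not this project's code.
import glob
import os
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

def extract_features(path, n_mfcc=40):
    """Load a clip and summarize it as a fixed-length mean-MFCC vector."""
    audio, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# RAVDESS file names encode the emotion as the third dash-separated field.
files = glob.glob("ravdess/Actor_*/*.wav")  # hypothetical dataset layout
X = np.array([extract_features(f) for f in files])
y = np.array([int(os.path.basename(f).split("-")[2]) for f in files])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A genetic algorithm could search over hidden_layer_sizes / alpha; fixed values here.
clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500, random_state=42)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```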
AI MOM is an AI-powered meeting intelligence platform that delivers real-time transcription, speaker recognition, and multi-LLM summaries using FastAPI, Whisper, Groq, and OpenRouter for intelligent meeting insights.
WhisperX ASR is a FastAPI-based application for automatic speech recognition. It transcribes audio files to text using WhisperX, supports multiple languages and batch processing, and offers both a web UI and a REST API.
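A minimal sketch of what a FastAPI transcription endpoint built on WhisperX can look like. The model size, device, compute type, and route path are assumptions, not this project's actual API.

```python
# Minimal FastAPI + WhisperX transcription endpoint (sketch, not this project's API).
import tempfile
import whisperx
from fastapi import FastAPI, UploadFile, File

app = FastAPI()
# Model size, device, and compute type are assumptions; adjust for GPU deployments.
model = whisperx.load_model("large-v2", device="cpu", compute_type="int8")

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Persist the upload to a temp file so WhisperX can load it from disk.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        tmp.write(await file.read())
        tmp_path = tmp.name
    audio = whisperx.load_audio(tmp_path)
    result = model.transcribe(audio, batch_size=8)
    text = " ".join(seg["text"].strip() for seg in result["segments"])
    return {"language": result.get("language"), "text": text}
```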
An example project that provides a browser-based web interface for real-time speech-to-text, built on the Azure real-time speech-to-text service and Socket.IO.
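A sketch of the server-side pattern such a project uses: Azure continuous recognition pushing each finalized utterance to connected browsers over Socket.IO. It is simplified to use the server's default microphone rather than streaming browser audio, and flask-socketio, the event name, and the environment-variable names are assumptions.

```python
# Sketch of pushing Azure continuous-recognition results to browsers over Socket.IO.
import os
import azure.cognitiveservices.speech as speechsdk
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")

speech_config = speechsdk.SpeechConfig(
    subscription=os.environ["AZURE_SPEECH_KEY"],
    region=os.environ["AZURE_SPEECH_REGION"],
)
# No audio_config given, so the recognizer listens on the default microphone.
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

def on_recognized(evt):
    # Forward each finalized utterance to all connected browser clients.
    socketio.emit("transcript", {"text": evt.result.text})

recognizer.recognized.connect(on_recognized)

if __name__ == "__main__":
    recognizer.start_continuous_recognition()
    socketio.run(app, port=5000)
```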
Jarvis AI Assistant is a comprehensive desktop application that transforms your computer into an intelligent, voice-controlled environment. Built with Python and modern web technologies, it provides hands-free access to system functions, health monitoring, information retrieval, and entertainment via natural voice commands and biometric security.
Pandore offers a set of tools that facilitate the most common corpus-processing tasks in digital humanities research. Automatic pipelines for several of these tasks are also available.
Medibot is a voice-enabled medical AI assistant using RAG for accurate healthcare conversations. Evolved from my text-based chatbot, it now understands spoken questions and responds with voice answers, making medical guidance more accessible through intuitive multimodal interaction.
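A minimal sketch of the retrieval step at the heart of such a RAG assistant: embed a small document set, embed the spoken question, and return the closest passages to ground the LLM's answer. The embedding model and the toy documents are assumptions, not Medibot's actual stack.

```python
# Minimal retrieval step for a RAG-style medical chatbot (illustrative only).
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Ibuprofen is a nonsteroidal anti-inflammatory drug used for pain and fever.",
    "Adults are generally advised to drink water regularly throughout the day.",
    "Paracetamol overdose can cause serious liver damage and needs urgent care.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
doc_vecs = model.encode(documents, normalize_embeddings=True)

def retrieve(question, k=2):
    """Return the k documents most similar to the question (cosine similarity)."""
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

# The retrieved passages would then be placed in the LLM prompt as grounding context.
print(retrieve("What happens if someone takes too much paracetamol?"))
```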
A web application for real-time voice transcription and speech-to-text conversion. Supports multiple languages and includes features like audio visualization, text-to-speech, word count, and easy export options.
A smart AI-powered platform that detects emotions from student voice input, classifies their intensity, prioritizes critical cases, and responds via an intelligent chatbot.
This project focuses on real-time Speech Emotion Recognition (SER) using the "ravdess-emotional-speech-audio" dataset. Leveraging standard audio-processing libraries and Long Short-Term Memory (LSTM) networks, it classifies the emotional states expressed in the dataset's 1440 audio files, which were recorded by 24 professional actors to ensure controlled representation.
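A sketch of an LSTM classifier over MFCC time sequences, which is a common setup for RAVDESS-based SER. The sequence length, layer sizes, and 8-class output are assumptions, not necessarily this project's configuration.

```python
# Sketch of an LSTM classifier over MFCC sequences for RAVDESS-style SER.
import numpy as np
import librosa
import tensorflow as tf

N_MFCC, MAX_FRAMES, N_CLASSES = 40, 200, 8  # assumed shapes and class count

def mfcc_sequence(path):
    """Return a (MAX_FRAMES, N_MFCC) MFCC sequence, padded or truncated."""
    audio, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=N_MFCC).T  # (frames, n_mfcc)
    padded = np.zeros((MAX_FRAMES, N_MFCC), dtype=np.float32)
    padded[: min(len(mfcc), MAX_FRAMES)] = mfcc[:MAX_FRAMES]
    return padded

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_FRAMES, N_MFCC)),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# Training would stack mfcc_sequence() outputs into X and emotion labels into y:
# model.fit(X_train, y_train, validation_split=0.1, epochs=50, batch_size=32)
```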
WALL-E is a Python-based AI voice assistant that listens to commands and performs tasks like searching Wikipedia, opening websites, playing songs, telling jokes, and more. It uses speech recognition and text-to-speech to create a smooth, hands-free experience.
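A sketch of the listen/act/speak loop such an assistant runs, using speech_recognition, pyttsx3, wikipedia, and webbrowser. The command keywords and library choices are illustrative, not the project's exact code.

```python
# Sketch of a listen -> act -> speak loop for a WALL-E-style assistant.
import webbrowser
import wikipedia
import pyttsx3
import speech_recognition as sr

engine = pyttsx3.init()

def speak(text):
    engine.say(text)
    engine.runAndWait()

def listen():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        return ""

while True:
    command = listen()
    if "wikipedia" in command:
        topic = command.replace("wikipedia", "").strip()
        speak(wikipedia.summary(topic, sentences=2))
    elif "open youtube" in command:
        webbrowser.open("https://www.youtube.com")
    elif "stop" in command:
        speak("Goodbye")
        break
```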
A simple web application to help users practice French pronunciation: record your voice, compare it against a reference phrase, get quick feedback, and iterate.
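One way the compare-and-feedback step could work: transcribe the recording with a French locale and score it against the reference phrase using a string-similarity ratio. The recognizer, locale string, and feedback threshold are assumptions, not this project's implementation.

```python
# Sketch of the record/compare/feedback step for French pronunciation practice.
import difflib
import speech_recognition as sr

def score_pronunciation(wav_path, reference):
    """Transcribe the recording in French and score similarity to the reference."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    try:
        heard = recognizer.recognize_google(audio, language="fr-FR")
    except sr.UnknownValueError:
        return 0.0, ""
    ratio = difflib.SequenceMatcher(None, heard.lower(), reference.lower()).ratio()
    return ratio, heard

ratio, heard = score_pronunciation("attempt.wav", "Bonjour, comment allez-vous ?")
print(f"heard: {heard!r}, similarity: {ratio:.0%}")
feedback = "Great!" if ratio > 0.85 else "Try again, focusing on each syllable."
print(feedback)
```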