An ASR (automatic speech recognition) system that takes individual recordings as input. Each recording is a sentence of 5-10 spoken English digits separated by clear pauses. The system segments each sentence with a classifier that distinguishes foreground sound (speech) from background sound.
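As a rough illustration of pause-based segmentation, the sketch below splits a signal into foreground spans using a fixed frame-energy threshold. This is a stand-in for the classifier the description mentions, not the project's method: the function names, the frame length, the threshold, and the synthetic tone-burst signal are all assumptions made for the example.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def segment_foreground(samples, frame_len=160, threshold=0.01, min_frames=2):
    """Return (start_sample, end_sample) spans whose frames exceed the threshold.

    Assumption: a simple energy threshold replaces the foreground/background
    classifier described in the text.
    """
    flags = [e > threshold for e in frame_energies(samples, frame_len)]
    segments, start = [], None
    # A trailing False acts as a sentinel so a segment at the end is closed.
    for idx, is_fg in enumerate(flags + [False]):
        if is_fg and start is None:
            start = idx
        elif not is_fg and start is not None:
            if idx - start >= min_frames:  # drop very short blips
                segments.append((start * frame_len, idx * frame_len))
            start = None
    return segments

def synth(n_bursts=3, sr=8000):
    """Hypothetical test signal: 440 Hz tone bursts separated by silence."""
    burst = [0.5 * math.sin(2 * math.pi * 440 * t / sr) for t in range(1600)]
    silence = [0.0] * 1600
    out = list(silence)
    for _ in range(n_bursts):
        out += burst + silence
    return out

if __name__ == "__main__":
    for start, end in segment_foreground(synth(3)):
        print(start, end)
```

Each returned span could then be passed to a per-digit recognizer. On real recordings a fixed threshold is fragile, which is one reason a trained foreground/background classifier is a reasonable design choice.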
Functionality for speech data processing, including time alignment, encoding with speech encoders (tokenizers), and preprocessing of common datasets.
Prototype of an intelligent research agent capable of literature retrieval, summarization, and contextual reasoning — a foundation for scientific automation tools.
This project presents Hera, an operating-system-level voice recognition package that understands voice commands and performs actions to simplify the user's workflow. We propose a modern way of interacting with Linux systems, where the latency of conventional physical inputs is minimized through natural-language speech recognition.
ScrAIbe Assistant is designed to leverage Whisper for precise audio processing and local LLMs via Ollama for efficient summarization. This tool is perfect for tasks such as taking notes from team meetings or lectures, offering a secure environment where no data—be it text, audio, or otherwise—leaves your local machine.
This project builds a Gradio web app that accepts audio input from the user. It transcribes the audio into Persian text and analyzes the speech to label its sentiment as positive or negative.