You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
🧑🏻🎓 📑 October'20 - April'21. Group uni project. The project topic is Speech-to-Text Assessment Tool. It is a research-type project, most of the documentation is in a private GitLab repository.
Created an ASR (Automatic Speech Recognition) system that takes in individual recordings. Each recording represents a sentence composed of 5-10 English language digits, separated by adequate pauses. The system involves segmenting the sentence using a classifier, differentiating between background and foreground sounds.
The project,being part of Kagglex BIPOC Mentorship Program final project, aims to train two separate Hindi ASR models using the Facebook Wav2Vec2 (300M parameters) and OpenAI Whisper-Small models, respectively. The goal is to compare their performance, with a target WER of less than 13%, across various Hindi accents and dialects.
Functionality for speech data processing including time alignment, encoding with speech encoders (tokenizers) and data preprocessing of common datasets
Prototype of an intelligent research agent capable of literature retrieval, summarization, and contextual reasoning — a foundation for scientific automation tools.
This project presents Hera, an Operating System level voice recognition package that understands voice commands to perform actions to simplify the user’s workflow. We propose a modernistic way of interacting with Linux systems, where the latency of conventional physical inputs are minimized through the use of natural language speech recognition.