Phoneme recognition using the pre-trained models Wav2vec2, HuBERT and WavLM. This project compares three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022), each pretrained on a corpus of English speech, and uses them in various ways to perform phoneme recognition for different languages wi…
Swarms supports the Common Voice Project from Mozilla! This repo contains the script to create recording jobs for the project on our platform. The generated datasets will follow!
Python implementation of end-to-end Automatic Speech Recognition using fully convolutional architectures (Jasper & QuartzNet), designed for efficient, scalable speech-to-text.
A fine-tuned Wav2Vec2-based Automatic Speech Recognition (ASR) system with data augmentation, efficient training, and transcription capabilities. Supports local and Mozilla Common Voice datasets, with evaluation via Word Error Rate (WER). 🚀
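For readers unfamiliar with the Word Error Rate mentioned above, here is a minimal sketch of how the metric is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. This is not taken from the repository; the function name `word_error_rate` is hypothetical, and the sketch assumes simple whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; assumes whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.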