Detecting Speed and Tempo Alterations in Speech Recordings
Updated Feb 11, 2026 - Python
Python implementation of the article "EMOVOME Database: Advancing Emotion Recognition in Speech Beyond Staged Scenarios"
A database of challenging voice utterances collected by the Biometrics Vision and Computing (BVC) group.
A convolutional neural network for gender classification, which achieved an F1-score of 94.3% when tested on the RAVDESS dataset. Created as postgraduate coursework; the report is included. The report also discusses Sodiq Adebiy's CNN, which I'd recommend to anyone interested in emotion classification.
RU directed speech classifier (ruElectra, synthetic ASR noise)
CNN-Based Approach for Audio File Classification. Contains notebooks illustrating the data preprocessing, feature extraction, model training, and model inference workflows, as well as the overall pipeline.
This repository contains the code for the INTERSPEECH 2025 paper: "Speech and Text Foundation Models for Depression Detection: Cross-Task and Cross-Language Evaluation"
This project aims to perform Emotion Recognition in Speech using Deep Neural Networks (DNNs)
Qafar-af and Amharic voice command recognition project for controlling the movement of a wheelchair
Code for audio-based autism spectrum disorder (ASD) classification using Transformer models, machine learning baselines, and SHAP analysis.
This project implements a speech emotion classification system using neural networks, with genetic algorithms for optimization. The system classifies emotions such as calm, happy, sad, angry, fearful, surprised, and disgusted from speech audio using the RAVDESS dataset.
Fall 2021 Introduction to Deep Learning - Homework 3 Part 2 (RNN-based phoneme recognition)
In this notebook, we aim to recognize speech commands using classification. For this purpose, we used the SPEECHCOMMANDS dataset and the deep convolutional model M5. The code is written in Python and designed for the PyTorch platform.
This project represents my research on dementia classification using audio data.
YAMNet for speech classification using C++ and ONNX Runtime. Finalist solution in the 2025 Qualcomm Edge Intelligence Innovation Application Competition.
This repository contains code for all assignments in the Multimedia Computing and Applications (CSE563) course.
A Python implementation of the Iterative Feature Normalization algorithm
Speech Classification using Continuous Attention Mechanisms
In this challenge, the goal is to learn to recognize which of several English words is pronounced in an audio recording. This is a multiclass classification task.