This repository contains a complete machine learning pipeline for Speech Emotion Recognition (SER) using Deep Neural Networks (DNNs). The pipeline is built with reproducibility and experiment tracking in mind, utilizing DVC for data and pipeline management, and MLflow (integrated with DagsHub) for experiment logging. Please see the DagsHub repository.
The goal of this project is to classify emotions from speech audio using a Convolutional Neural Network (CNN)-based architecture. The system supports five emotion classes and is trained and evaluated using stratified 5-fold cross-validation.
├── data/
│ ├── modified_shemo.json # Preprocessed metadata
│ └── npy/ # Numpy feature arrays
├── models/ # Trained Keras models per fold
├── results/ # Evaluation reports and confusion matrices
├── runs/ # Run ID files for MLflow linkage
├── src/
│ ├── preprocess.py # Feature extraction and preprocessing
│ ├── train.py # Training with Optuna hyperparameter tuning
│ └── test.py # Evaluation and logging of test results
├── dvc.yaml # DVC pipeline configuration
├── params.yaml # Hyperparameters and path configuration
└── README.md # Project documentation
- Extracts features from the Shemo dataset and saves them in data/npy/.
- Controlled by parameters in params.yaml > preprocess.
- Performs 5-fold cross-validation.
- Uses Optuna for hyperparameter tuning within each fold.
- Logs the best model for each fold to MLflow and DagsHub.
- Controlled by parameters in params.yaml > train.
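To make the stratified splitting in the training stage concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code in src/train.py, which may well rely on scikit-learn's `StratifiedKFold` instead. The function name `stratified_folds`, the integer class labels, and the toy dataset size are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread evenly across folds."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled samples round-robin into the folds,
    # so per-class counts differ by at most one between folds.
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)

    # Each fold serves once as the validation set; the rest form the training set.
    for k in range(n_splits):
        val_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, val_idx


# Toy data: five hypothetical classes (0..4) with ten samples each.
labels = [cls for cls in range(5) for _ in range(10)]
splits = list(stratified_folds(labels))
```

In a setup like this, hyperparameter search would run inside the loop over `splits`, so each fold's best model is selected using only that fold's training and validation indices.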
- Loads each fold’s best model and evaluates it on the test set.
- Logs metrics and confusion matrices to MLflow.
- Controlled by parameters in params.yaml > test.
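The evaluation stage reports metrics and confusion matrices. The sketch below shows how such a matrix and an accuracy score can be derived from true and predicted class indices. It is a hedged, dependency-free illustration: the helper names `confusion_matrix` and `accuracy` and the toy labels are assumptions, and src/test.py may use scikit-learn's `confusion_matrix` rather than anything like this.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy example with three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
score = accuracy(cm)
```

Here four of the six toy predictions are correct, so `score` is 4/6.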
This project uses DVC to track data and pipeline stages. To reproduce the entire pipeline:
git clone https://github.com/your-username/project-dnn-ser-pipeline.git
cd project-dnn-ser-pipeline
dvc pull # fetch data (if remote is configured)
dvc repro # run the pipeline from scratch
We use MLflow integrated with DagsHub to track experiments, parameters, models, and metrics. Each cross-validation fold is logged as a nested run.
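The runs/ directory is described as holding run ID files for MLflow linkage. One plausible way for the training and test stages to share a fold's run is to persist the run ID to a small text file and read it back later. The sketch below assumes a hypothetical `fold_<n>.txt` naming scheme and made-up helper names; the actual file layout in this repository may differ.

```python
import tempfile
from pathlib import Path


def save_run_id(runs_dir, fold, run_id):
    """Write one fold's run ID to <runs_dir>/fold_<fold>.txt (hypothetical naming)."""
    runs_dir = Path(runs_dir)
    runs_dir.mkdir(parents=True, exist_ok=True)
    path = runs_dir / f"fold_{fold}.txt"
    path.write_text(run_id + "\n", encoding="utf-8")
    return path


def load_run_id(runs_dir, fold):
    """Read the run ID back, stripping the trailing newline."""
    path = Path(runs_dir) / f"fold_{fold}.txt"
    return path.read_text(encoding="utf-8").strip()


# Round trip in a temporary directory; "abc123" is a placeholder, not a real run ID.
with tempfile.TemporaryDirectory() as tmp:
    save_run_id(tmp, fold=1, run_id="abc123")
    restored = load_run_id(tmp, fold=1)
```

A restored ID could then be passed to `mlflow.start_run(run_id=...)` so that test metrics land in the same run that training created for that fold.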
- Python 3.10+
- TensorFlow, NumPy, Scikit-learn, Optuna, DVC, MLflow, etc.
Install dependencies using:
pip install -r requirements.txt
Hyperparameters and paths are managed via params.yaml. Example:
train:
  inputs_path: data/npy/
  models_path: models/
  runs_path: runs/
  n_trials: 20
For questions or feedback, please open an issue or contact aliyzd95.
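As a companion to the params.yaml example for the train stage, the following sketch shows one way a script might validate that section once it has been parsed into a dictionary. The `TrainConfig` class is hypothetical and not part of this repository; the dictionary stands in for the output of a YAML parser such as PyYAML's `yaml.safe_load`.

```python
from dataclasses import dataclass
from pathlib import Path

REQUIRED_KEYS = {"inputs_path", "models_path", "runs_path", "n_trials"}


@dataclass(frozen=True)
class TrainConfig:
    """Typed view of the `train` section of params.yaml (hypothetical helper)."""

    inputs_path: Path
    models_path: Path
    runs_path: Path
    n_trials: int

    @classmethod
    def from_dict(cls, section):
        # Fail early with a clear message if a key is absent.
        missing = REQUIRED_KEYS - set(section)
        if missing:
            raise KeyError(f"params.yaml > train is missing: {sorted(missing)}")
        n_trials = int(section["n_trials"])
        if n_trials < 1:
            raise ValueError("n_trials must be at least 1")
        return cls(
            inputs_path=Path(section["inputs_path"]),
            models_path=Path(section["models_path"]),
            runs_path=Path(section["runs_path"]),
            n_trials=n_trials,
        )


# Stand-in for the parsed `train` section shown in the example.
train_section = {
    "inputs_path": "data/npy/",
    "models_path": "models/",
    "runs_path": "runs/",
    "n_trials": 20,
}
config = TrainConfig.from_dict(train_section)
```

Validating up front keeps a typo in params.yaml from surfacing only after a long tuning run has started.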