

πŸŽ™οΈ KuralHub: A Comprehensive Review of Speech Emotion Recognition (SER) Datasets



🔥 What is KuralHub?

KuralHub is a comprehensive repository that reviews and benchmarks Speech Emotion Recognition (SER) datasets across multiple languages.
It provides detailed metadata, access links, and benchmarks using fine-tuned monolingual models for SER.

📄 Paper: Accepted at Interspeech 2026 (to appear)
🌐 Website: https://aaivu.github.io/KuralHub/


🗂 Survey Organisation

KuralHub/
│── datasets/             # Language-specific datasets
│   ├── english/
│   │   ├── README.md     # Overview of English SER datasets
│   │   ├── ravdess.md    # Dataset-specific details
│   ├── spanish/
│   │   ├── README.md
│   │   ├── dataset1.md
....

📊 SER Datasets Coverage

This survey covers 70+ languages (with 29 benchmarked), including open-source and restricted datasets.
If a language has no available dataset, it is marked accordingly.

Language Coverage


🚀 Benchmarks

We fine-tune pre-trained SER models on each dataset individually and report the resulting performance.

Performance by Datasets

[Figure: Model Dataset Performance]

Average Performance by Languages

[Figure: Model Language Performance]
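
As a rough illustration of how per-dataset scores could be rolled up into the per-language averages reported here, the sketch below groups scores by language and takes an unweighted mean. The function name `average_by_language`, the tuple layout, and the unweighted averaging are assumptions made for this example; the README does not specify the metric or the aggregation the authors used.

```python
from collections import defaultdict
from statistics import mean


def average_by_language(results):
    """Average per-dataset scores within each language.

    `results` is an iterable of (language, dataset, score) tuples.
    Using an unweighted mean over datasets is an assumption of this
    sketch, not a documented KuralHub procedure.
    """
    scores = defaultdict(list)
    for language, _dataset, score in results:
        scores[language].append(score)
    return {language: mean(values) for language, values in scores.items()}


if __name__ == "__main__":
    # Placeholder rows with made-up scores, NOT results from the paper.
    example = [
        ("english", "dataset_a", 0.80),
        ("english", "dataset_b", 0.60),
        ("spanish", "dataset_c", 0.50),
    ]
    for language, avg in average_by_language(example).items():
        print(f"{language}: {avg:.2f}")
```

A weighted mean (for example by number of test utterances) would be an equally plausible choice; the published figures should be taken from the repository's benchmark results rather than recomputed this way.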


📥 How to Use

  1. Browse Datasets: Navigate to datasets/ for language-specific SER datasets.
  2. Download Datasets: Follow access links in each dataset file.
  3. Run Benchmarks: Check benchmarks/ for model performance.
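
The browsing step can also be scripted against a local clone. The following is a minimal sketch that assumes the `datasets/<language>/<dataset>.md` layout shown in the survey organisation tree; the helper name `list_dataset_files` is hypothetical and is not part of the repository.

```python
from pathlib import Path


def list_dataset_files(repo_root):
    """Map each language folder under datasets/ to its dataset description files.

    Assumes the layout from the survey organisation tree:
    datasets/<language>/README.md plus one .md file per dataset.
    """
    datasets_dir = Path(repo_root) / "datasets"
    index = {}
    if not datasets_dir.is_dir():
        return index  # not a KuralHub checkout, or the layout differs
    for lang_dir in sorted(p for p in datasets_dir.iterdir() if p.is_dir()):
        # Skip the per-language overview; keep only dataset-specific files.
        files = sorted(
            f.name for f in lang_dir.glob("*.md") if f.name.lower() != "readme.md"
        )
        index[lang_dir.name] = files
    return index


if __name__ == "__main__":
    # "." is assumed to be the root of a local clone of the repository.
    for language, files in list_dataset_files(".").items():
        print(f"{language}: {', '.join(files) or '(no dataset files)'}")
```

Each listed file is expected to contain the access link for its dataset, so the output doubles as a checklist for the download step.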

🎯 Contribute to KuralHub

💡 Know of a missing dataset? Help us expand KuralHub!
📩 Submit a pull request or open an issue with new datasets.

📖 Contribution Guidelines


📜 Citation

If you use our research findings, please cite the following paper. Citation details will be finalized once the paper is published.

@inproceedings{kuralhub2026,
  title     = {KuralHub: A Comprehensive Review of Speech Emotion Recognition Datasets},
  author    = {Thavarasa, Luxshan and Thevakumar, Jubeerathan and Sivatheepan, Thanikan and Thayasivam, Uthayasanker},
  booktitle = {Interspeech},
  year      = {2026},
  note      = {To appear}
}

📬 Contact

🏷️ Name | 📧 Email | 🔗 LinkedIn | 📚 Google Scholar
Luxshan Thavarasa | luxshan.20@cse.mrt.ac.lk | LinkedIn | -
Jubeerathan Thevakumar | jubeerathan.20@cse.mrt.ac.lk | LinkedIn | -
Thanikan Sivatheepan | thanikan.20@cse.mrt.ac.lk | LinkedIn | -
Uthayasanker Thayasivam (supervisor, corresponding author) | rtuthaya@cse.mrt.ac.lk | LinkedIn | -

All authors are with the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka.

πŸ™ Acknowledgment

We would like to thank Dr. Uthayasanker Thayasivam for his guidance as our supervisor, Braveenan Sritharan for his mentorship, and all the dataset owners for making their datasets available to us through open access or upon request. Your support has been invaluable.

