This is a preliminary experiment to build an end-to-end Arabic ASR system using the DeepSpeech engine (https://github.com/mozilla/DeepSpeech).
I used Common Voice Corpus 7.0 for training (https://commonvoice.mozilla.org/en/datasets). In this version, the Arabic subset contains 1,052 voices with a total length of 117 hours. However, the model was trained on just 30 hours and tested on 12 hours.
Training on a workstation with 4 RTX GPUs took less than 20 minutes. The WER is very high, but it can be improved with more data. I also have not used a scorer (language model) to improve the result. For comparison with an alternative approach, I reran the experiment on Kaldi to build a DNN-HMM based model (https://github.com/asrajeh/kaldi-arabic). Its performance is much better (WER 29%). DeepSpeech models require many recording hours (>30) before reaching an acceptable WER.
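For context, a DeepSpeech training run of this kind looks roughly like the sketch below. The CSV paths, epoch count, and export directory are illustrative placeholders, not the exact values from this experiment; the flags themselves are standard DeepSpeech.py options.

```bash
# Illustrative DeepSpeech training invocation (paths and hyperparameters
# are placeholders, not the exact values used in this experiment).
python3 DeepSpeech.py \
  --train_files clips/train.csv \
  --dev_files clips/dev.csv \
  --test_files clips/test.csv \
  --alphabet_config_path alphabet-ar.txt \
  --epochs 30 \
  --export_dir exported-model/
```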
You can download the model and example audio from here:
https://drive.google.com/file/d/1Gn9RvoSUhq5lDCAZHgOVxt8Ypdg6DdZH/view?usp=sharing
https://drive.google.com/file/d/1A8hxuSTwpYibh6Q0lblWNfDEJyzKPLrZ/view?usp=sharing
```bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate

# Install DeepSpeech
pip3 install deepspeech

# Transcribe an audio file
deepspeech --model common_voice-v7.0-ar.pbmm --audio common_voice_ar_24033139.wav
```
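As noted above, no scorer was used in this experiment. If a KenLM scorer were built for Arabic, it could be passed at inference time via the deepspeech CLI's standard `--scorer` flag; `kenlm-ar.scorer` below is a hypothetical file name.

```bash
# Hypothetical: decoding with an external scorer (language model).
# kenlm-ar.scorer is an assumed file name, not part of this repo.
deepspeech --model common_voice-v7.0-ar.pbmm \
           --scorer kenlm-ar.scorer \
           --audio common_voice_ar_24033139.wav
```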
I followed this crash course:
https://mozilla.github.io/deepspeech-playbook/
You need to preprocess the data (preprocess_cv.sh) and use the Arabic alphabet file (alphabet-ar.txt). Have a look at my training log file (common_voice-v7.0-ar.log). A sketch of the underlying preprocessing step follows.
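The core of the preprocessing is DeepSpeech's bundled Common Voice importer, which converts the mp3 clips to wav and writes train/dev/test CSVs. The sketch below shows the kind of command preprocess_cv.sh wraps; the corpus path is illustrative.

```bash
# Sketch of Common Voice preprocessing with DeepSpeech's bundled importer
# (roughly what preprocess_cv.sh automates; the corpus path is a placeholder).
# Converts mp3 clips to wav and generates train/dev/test CSV files,
# filtering transcripts against the Arabic alphabet.
python3 bin/import_cv2.py --filter_alphabet alphabet-ar.txt /path/to/cv-corpus-7.0/ar
```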