EXERCISE 2
Speech Recognition in Python using CMU Sphinx
    “Hey, Siri!”, “Okay, Google!” and “Alexa, play some music” are phrases
    that have become part of everyday life, because giving voice commands to
    virtual assistants makes many tasks a lot easier. But have you ever
    wondered how these devices take commands by voice?
    Do applications understand your voice? How does the computer even
    decode speech if it only understands 0s and 1s?
    The answer is simple: the device uses speech recognition software to
    decode the user input received as speech through its microphone. The
    task of this software is to convert the speech to a string (text) so
    that the computer can then process it.
    One such toolkit is CMU Sphinx, an open-source toolkit for speech
    recognition. It includes a lightweight recognizer library called
    Pocketsphinx, which we will use to recognize speech. This library is
    especially useful when you are offline. When you have internet access,
    you may prefer the Google API with speech recognition because of its
    higher accuracy, but when you are building a project that works offline
    or uses speech on an offline embedded device, use pocketsphinx.
    Recognition Process
    Let’s look at how this library works behind the scenes to recognize our
    voice. It takes a waveform and splits it into utterances at the
    silences. It then tries to work out what is being said in each
    utterance: it considers the possible combinations of words, tries to
    match them against the audio, and chooses the best-matching combination.
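    The splitting step can be pictured with a small sketch. This is not
    Pocketsphinx's actual algorithm, only a toy illustration that assumes
    the audio is a list of integer amplitudes. The function name
    split_on_silence and the threshold and min_silence parameters are
    hypothetical.

```python
def split_on_silence(samples, threshold=500, min_silence=3):
    """Split amplitude samples into utterances separated by silence.

    A pause of at least `min_silence` consecutive samples whose absolute
    value is below `threshold` closes the current utterance.
    """
    utterances = []
    current = []   # samples of the utterance being built
    pending = []   # quiet samples that may turn out to be a short pause
    for sample in samples:
        if abs(sample) >= threshold:
            # Loud sample: any short pause seen so far belongs to the utterance.
            current.extend(pending)
            pending = []
            current.append(sample)
        elif current:
            pending.append(sample)
            if len(pending) >= min_silence:
                # The pause is long enough: close the utterance, drop the silence.
                utterances.append(current)
                current, pending = [], []
        # Quiet samples before any speech are simply ignored.
    if current:
        utterances.append(current)
    return utterances


# Toy waveform: two bursts of sound separated by a long pause.
waveform = [0, 0, 900, 1200, 10, 800, 0, 0, 0, 0, 700, 650, 0]
print(split_on_silence(waveform))
# → [[900, 1200, 10, 800], [700, 650]]
```

    A real recognizer works on short frames of audio rather than single
    samples, but the idea of cutting at sufficiently long pauses is the
    same.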
    Installation of modules
    Pocketsphinx is an external library, i.e. it is not built into Python,
    so we install it on our machine with pip and then use import to access
    its functionality.
    Open your terminal. First make sure that you have the latest version of
    pip installed. If not, upgrade it with the following command:
    python -m pip install --upgrade pip setuptools wheel
    Once pip is up to date, type the following command into your terminal:
    pip install pocketsphinx
    Now that pocketsphinx is installed on your machine, let's move on.
    Prerequisites
    Two prerequisite libraries are used alongside pocketsphinx:
    1. SpeechRecognition – used for speech recognition, with support for
       several engines and APIs, online and offline.
    2. PyAudio – used to play and record audio in Python.
    It is recommended to install both libraries with pip. PyAudio is built
    on the PortAudio library, which the brew command below installs on
    systems that use Homebrew (typically macOS):
    pip install SpeechRecognition
    brew install portaudio
    pip install pyaudio
    All the required external libraries are now installed, so let's move on
    to the code.
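    Before moving on, it can help to confirm that the three packages are
    importable. The following is a minimal sketch using only the standard
    library. The helper name missing_packages is hypothetical; the import
    names (pocketsphinx, speech_recognition, pyaudio) are the module names
    these packages are normally imported under.

```python
import importlib.util

# pip package name -> module name used in import statements
REQUIRED_MODULES = {
    "pocketsphinx": "pocketsphinx",
    "SpeechRecognition": "speech_recognition",
    "PyAudio": "pyaudio",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose modules cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

    This only checks that the modules can be located. It does not verify
    that a microphone is available or that PortAudio works.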
    LiveSpeech
    LiveSpeech is an iterator class provided by pocketsphinx that can be
    used for continuous recognition or keyword search from a microphone.
    Here is the code for continuous recognition.
   Python3
     # import LiveSpeech
     from pocketsphinx import LiveSpeech

     # LiveSpeech() listens on the microphone and yields one result per utterance
     for phrase in LiveSpeech():
         # str(phrase) is the recognized text; it is assumed to be empty
         # when nothing could be recognized
         text = str(phrase)
         if text:
             print(text)
         else:
             print("Sorry! could not recognize what you said")
    Output :
    We used LiveSpeech in a basic for-in loop to fetch continuous speech
    input from the user through the device microphone. Each result is
    stored in phrase, and we display the text recognized for each utterance.
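    Because the loop above runs for as long as the microphone is open, a
    common need is a way to stop it. The sketch below shows one option: a
    hypothetical helper, collect_until, that gathers phrases until a stop
    word is heard. It accepts any iterable whose items convert to text
    with str(), so it is demonstrated here on a plain list rather than a
    live microphone.

```python
def collect_until(phrases, stop_word="stop"):
    """Collect recognized phrases until one contains `stop_word`.

    `phrases` can be any iterable whose items convert to text with str().
    """
    heard = []
    for phrase in phrases:
        text = str(phrase).strip()
        if not text:
            # Nothing was recognized for this utterance; skip it.
            continue
        if stop_word in text.lower().split():
            # The stop word was spoken: leave the loop.
            break
        heard.append(text)
    return heard


# Stand-in for microphone input, for illustration only.
fake_input = ["hello world", "", "go forward", "please stop now", "ignored"]
print(collect_until(fake_input))
# → ['hello world', 'go forward']
```

    With a working microphone, the same helper could presumably be called
    as collect_until(LiveSpeech()), assuming each result converts to its
    recognized text with str() as in the printing loop.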
    Keyword searching
    We create a variable named speech of type pocketsphinx.LiveSpeech by
    calling the LiveSpeech class with the arguments keyphrase, i.e. the
    keyword to search for, and kws_threshold, the detection threshold for
    that keyword. We then run a for-in loop over speech, which continuously
    listens for voice input. If the user utters the word ‘forward’, it is
    printed along with its segments.
   Python3
     # importing LiveSpeech
     from pocketsphinx import LiveSpeech

     speech = LiveSpeech(keyphrase='forward', kws_threshold=1e-20)
     # a for-in loop to iterate over speech
     for phrase in speech:
         # print the keyword together with its segments each time it is spoken
         print(phrase.segments(detailed=True))
    Output :
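    The detailed segments are easier to read once frame numbers are turned
    into seconds. The sketch below assumes that each detailed segment is a
    (word, score, start_frame, end_frame) tuple and that the decoder runs
    at 100 frames per second. Both are assumptions to check against what
    your installed version actually prints. The helper name
    segments_to_seconds and the sample values are made up for illustration.

```python
FRAMES_PER_SECOND = 100  # assumed decoder frame rate; adjust if yours differs


def segments_to_seconds(segments, frame_rate=FRAMES_PER_SECOND):
    """Convert (word, score, start_frame, end_frame) tuples to seconds."""
    return [
        {
            "word": word,
            "score": score,
            "start": start / frame_rate,
            "end": end / frame_rate,
        }
        for word, score, start, end in segments
    ]


# Made-up segment in the assumed tuple layout, not real recognizer output.
sample = [("forward", -617, 63, 121)]
for seg in segments_to_seconds(sample):
    print(seg["word"], seg["start"], seg["end"])
# → forward 0.63 1.21
```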
    Test program
    First, import speech_recognition under a short reference name such as
    aud. Next, create an object of type speech_recognition.Recognizer, here
    called a, which will recognize the audio and convert it to text. Then
    define the microphone as your source of input and use a variable, say
    audio, to hold the result of listen, i.e. the speech recorded from the
    user. Finally, inside a try block, call recognize_sphinx and pass it
    audio as the argument. This method converts what the user said (speech)
    to text, which we display in the console. This process is simply called
    recognition.
    If the speech cannot be understood, for example because the voice is
    unclear, recognize_sphinx raises UnknownValueError. A RequestError
    indicates a problem with the Sphinx installation. We catch both
    exceptions.
   Python3
     import speech_recognition as aud

     # create a Recognizer, which converts recorded audio to text
     a = aud.Recognizer()
     # declare the device microphone as the source of audio input
     with aud.Microphone() as source:
         print("Say something!")
         # listen records what the user says and stores it in audio
         audio = a.listen(source)
     # invoke sphinx for speech recognition
     try:
         # print the recognized text
         print("You said " + a.recognize_sphinx(audio))
     except aud.UnknownValueError:
         # the speech could not be understood
         print("Could not understand")
     except aud.RequestError as e:
         # there is a problem with the Sphinx installation
         print("Error; {0}".format(e))
    Output:
Conclusion
This concludes our discussion of speech recognition using CMU Sphinx. This
useful library has many more applications.