SpeechRecognition
Speech recognition is a technology that allows computers to understand and
process human speech. Python, with its simplicity and robust libraries, offers
several modules to tackle speech recognition tasks effectively. One of the most
popular libraries for this purpose is the SpeechRecognition library.
With the SpeechRecognition Library
In this section, we will base our speech recognition system on this tutorial.
The SpeechRecognition library offers many transcription engines, such as Google Speech
Recognition, which is the one we'll be using.
Before we get started, let's install the required libraries:
$ pip install SpeechRecognition pydub
Open up a new file named speechrecognition.py, and add the following:
# importing libraries
import speech_recognition as sr
import os
from pydub import AudioSegment
from pydub.silence import split_on_silence
# create a speech recognition object
r = sr.Recognizer()
The function below loads the audio file, performs speech recognition, and
returns the text:
# a function to recognize speech in the audio file
# so that we don't repeat ourselves in other functions
def transcribe_audio(path):
    # use the audio file as the audio source
    with sr.AudioFile(path) as source:
        # read the whole file into an audio data object
        audio_listened = r.record(source)
        # try converting it to text
        text = r.recognize_google(audio_listened)
    return text
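Before handing a file to transcribe_audio(), it can help to check its basic properties, since long recordings are better split into chunks first. The following is a minimal sketch using only Python's standard wave module; wav_info() and write_test_tone() are hypothetical helper names that are not part of SpeechRecognition, and the sketch assumes an uncompressed PCM WAV file.

```python
import math
import struct
import wave

def wav_info(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        duration = wav.getnframes() / sample_rate
    return channels, sample_rate, duration

def write_test_tone(path, seconds=1.0, sample_rate=16000, freq=440.0):
    """Write a mono 16-bit sine tone; only used here to demonstrate wav_info()."""
    n_frames = int(seconds * sample_rate)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / sample_rate)))
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

For example, calling wav_info() on a recording and looking at the returned duration is one way to decide whether to transcribe the file in one go or to split it first.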
Next, we write a function that splits the audio file into chunks on silence:
# a function that splits the audio file into chunks on silence
# and applies speech recognition
def get_large_audio_transcription_on_silence(path):
    """
    Splitting the large audio file into chunks
    and apply speech recognition on each of these chunks
    """
    # open the audio file using pydub
    sound = AudioSegment.from_file(path)
    # split the audio where silence lasts 500 milliseconds or more and get chunks
    chunks = split_on_silence(sound,
        # minimum length of silence in ms; experiment with this value for your target audio file
        min_silence_len=500,
        # anything quieter than this level (in dBFS) counts as silence; adjust this per requirement
        silence_thresh=sound.dBFS - 14,
        # keep 500 ms of silence at the edges of each chunk, adjustable as well
        keep_silence=500,
    )
    folder_name = "audio-chunks"
    # create a directory to store the audio chunks
    if not os.path.isdir(folder_name):
        os.mkdir(folder_name)
    whole_text = ""
    # process each chunk
    for i, audio_chunk in enumerate(chunks, start=1):
        # export the audio chunk and save it in
        # the `folder_name` directory
        chunk_filename = os.path.join(folder_name, f"chunk{i}.wav")
        audio_chunk.export(chunk_filename, format="wav")
        # recognize the chunk
        with sr.AudioFile(chunk_filename) as source:
            audio_listened = r.record(source)
        # try converting it to text
        try:
            text = r.recognize_google(audio_listened)
        except sr.UnknownValueError as e:
            # raised when the speech in this chunk is unintelligible
            print("Error:", str(e))
        else:
            text = f"{text.capitalize()}. "
            print(chunk_filename, ":", text)
            whole_text += text
    # return the text for all chunks detected
    return whole_text
print(get_large_audio_transcription_on_silence("7601-291468-0006.wav"))
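To build some intuition for min_silence_len and silence_thresh, here is a simplified sketch of the idea behind splitting on silence. It is not pydub's implementation: it works on a hypothetical list of loudness values (in dBFS, one value per window of window_ms milliseconds), and find_speech_ranges() is an illustrative name.

```python
def find_speech_ranges(levels_dbfs, window_ms, silence_thresh, min_silence_len):
    """Return (start_ms, end_ms) ranges of non-silent audio.

    A quiet gap only separates two ranges if it stays below
    silence_thresh for at least min_silence_len milliseconds.
    """
    ranges = []
    start = None          # start (ms) of the range being built
    last_loud_end = None  # end (ms) of the most recent loud window
    for i, level in enumerate(levels_dbfs):
        t = i * window_ms
        if level >= silence_thresh:
            if start is None:
                start = t
            elif t - last_loud_end >= min_silence_len:
                # the gap was long enough: close the previous range
                ranges.append((start, last_loud_end))
                start = t
            last_loud_end = t + window_ms
    if start is not None:
        ranges.append((start, last_loud_end))
    return ranges

# 100 ms windows: speech, a short pause, speech, a long pause, speech
levels = [-20] * 3 + [-60] * 2 + [-20] * 2 + [-60] * 6 + [-20]
print(find_speech_ranges(levels, 100, silence_thresh=-34, min_silence_len=500))
# prints [(0, 700), (1300, 1400)]
```

The 200 ms pause is shorter than min_silence_len, so it stays inside the first range, while the 600 ms pause produces a split. Lowering min_silence_len or raising silence_thresh yields more, shorter chunks.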
Implementing Speech Recognition with Python
A basic implementation using the SpeechRecognition library involves several steps:
Audio Capture: Capturing audio from the microphone using PyAudio.
Audio Processing: Converting the audio signal into data that the SpeechRecognition library can work
with.
Recognition: Calling the recognize_google() method (or another available recognition method) of
the library's Recognizer object to convert the audio data into text.
Pro_2
import speech_recognition as sr
# initialize the recognizer (used for recognizing the speech)
r = sr.Recognizer()
# read from the microphone as the source,
# listen to the speech and store it in the audio_text variable
with sr.Microphone() as source:
    print("Talk")
    audio_text = r.listen(source)
    print("Time over, thanks")
# recognize_google() raises UnknownValueError if the speech is unintelligible
# and RequestError if the API is unreachable, hence the exception handling
try:
    # using Google Speech Recognition
    print("Text: " + r.recognize_google(audio_text))
except sr.UnknownValueError:
    print("Sorry, I did not get that")
except sr.RequestError as e:
    print("Could not reach the recognition service:", e)
Speech Recognition in Python using Google Speech API
Install the SpeechRecognition module:
sudo pip install SpeechRecognition
PyAudio: Linux users can use the following command:
sudo apt-get install python-pyaudio python3-pyaudio
If the versions in the repositories are too old, install PyAudio using the following command:
sudo apt-get install portaudio19-dev python-all-dev python3-all-dev && sudo pip install pyaudio
On other platforms, PyAudio can typically be installed with pip:
pip install pyaudio
When the connected audio devices are listed, the microphone appears under a name such as:
USB Device 0x46d:0x825: Audio (hw:1, 0)
Make a note of this name, as it will be used in the program.
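One way to turn a noted device name into an index is to search the list of microphone names that SpeechRecognition reports. The sketch below is a hedged illustration: find_device_index() and list_microphones() are hypothetical helper names, while sr.Microphone.list_microphone_names() is the library call that returns the names (it requires PyAudio).

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # some entries may be missing a name, so skip empty values
        if name and keyword in name.lower():
            return index
    return None

def list_microphones():
    """Return the device names reported by SpeechRecognition (requires PyAudio)."""
    # imported lazily so that find_device_index() works without the library
    import speech_recognition as sr
    return sr.Microphone.list_microphone_names()
```

With a microphone attached, find_device_index(list_microphones(), "USB Device") would give the index to use as the device ID, or None if no matching device is found.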
Set Chunk Size: This involves specifying how much audio data (how many samples) we want to read at once.
Typically, this value is specified in powers of 2, such as 1024 or 2048.
Set Sampling Rate: The sampling rate defines how often values are recorded for processing.
Set Device ID to the selected microphone: In this step, we specify the device ID of the microphone
that we wish to use, to avoid ambiguity in case there are multiple microphones. This also
helps with debugging: while running the program, we will know whether the specified
microphone is being recognized. In the program, we specify a parameter device_id, and the
program will report that device_id could not be found if the microphone is not recognized.
Allow Adjusting for Ambient Noise: Since the surrounding noise varies, we must allow the program a
second or two to adjust the energy threshold of the recording to the external
noise level.
Speech-to-text conversion: This is done with the help of Google Speech Recognition, which requires an
active internet connection to work. There are offline recognition systems such as
PocketSphinx, but they have an involved installation process that requires several dependencies.
Google Speech Recognition is one of the easiest to use.
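The steps above can be sketched as follows. This is a minimal illustration rather than a definitive program: listen_once() and chunk_duration_ms() are hypothetical names, the default sample rate and chunk size are example values, and the sketch assumes chunk_size is counted in samples (frames), which is how the library's Microphone class treats it.

```python
def chunk_duration_ms(chunk_size, sample_rate):
    """Milliseconds of audio covered by one chunk of chunk_size samples."""
    return 1000.0 * chunk_size / sample_rate

def listen_once(device_id=None, sample_rate=48000, chunk_size=2048):
    """Capture one phrase from the chosen microphone and return the recognized text."""
    # lazy import: needs SpeechRecognition, PyAudio and a working microphone
    import speech_recognition as sr
    r = sr.Recognizer()
    with sr.Microphone(device_index=device_id, sample_rate=sample_rate,
                       chunk_size=chunk_size) as source:
        # sample the background noise for about a second to set the energy threshold
        r.adjust_for_ambient_noise(source, duration=1)
        print("Say something")
        audio = r.listen(source)
    try:
        # needs an active internet connection
        return r.recognize_google(audio)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand the audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition:", e)
    return None
```

As a rough guide to choosing values, chunk_duration_ms(1024, 16000) is 64.0, so at 16 kHz each 1024-sample chunk covers 64 ms of audio.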
SPEECH HINDI
pip install SpeechRecognition
pip install PyAudio
pip install pipwin
pipwin install pyaudio
Write a program for speech recognition in Hindi
# import required module
import speech_recognition as sr
# explicit function to take input commands
# and recognize them
def takeCommandHindi():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Listening')
        # seconds of non-speaking audio before
        # a phrase is considered complete
        r.pause_threshold = 0.7
        audio = r.listen(source)
    try:
        print("Recognizing")
        # recognize the command spoken in Hindi
        Query = r.recognize_google(audio, language='hi-IN')
        print("the query is printed='", Query, "'")
    # handle the exception so that the assistant can
    # ask the user to repeat the command
    except Exception as e:
        print(e)
        print("Say that again sir")
        return "None"
    return Query
# Driver Code
# call the function
takeCommandHindi()
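The same pattern extends to other languages by changing the language argument of recognize_google(). The sketch below is a hedged generalization: LANGUAGE_TAGS, language_tag() and take_command() are hypothetical names, and the small tag table is an illustrative assumption that can be extended as needed.

```python
# spoken-language names mapped to the tags passed as `language=`;
# this table is an illustrative assumption, not part of the library
LANGUAGE_TAGS = {
    "hindi": "hi-IN",
    "english (india)": "en-IN",
    "english (us)": "en-US",
}

def language_tag(name):
    """Look up the language tag for a human-readable language name."""
    try:
        return LANGUAGE_TAGS[name.strip().lower()]
    except KeyError:
        raise ValueError(f"no language tag configured for {name!r}") from None

def take_command(language="hindi"):
    """Listen on the default microphone and return the recognized text, or None."""
    # lazy import: needs SpeechRecognition, PyAudio and a working microphone
    import speech_recognition as sr
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
    try:
        return r.recognize_google(audio, language=language_tag(language))
    except (sr.UnknownValueError, sr.RequestError) as e:
        print("Recognition failed:", e)
        return None
```

Calling take_command("english (india)") would then listen for a command in Indian English instead of Hindi.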