Hackers Realm

Convert Speech to Text using Python | Speech Recognition | Machine Learning Project Tutorial

Updated: May 31, 2023

Convert speech to text with Python! This hands-on project tutorial uses the SpeechRecognition library and the Google Speech API to transcribe spoken words into written text, both from a live microphone and from an audio file. #SpeechToText #Python #SpeechRecognition #MachineLearning #NLP

Convert Speech to Text using Speech Recognition

In this project tutorial we will install the SpeechRecognition module, which uses the Google Speech API, and convert real-time audio to text as well as convert an audio file to text data.




Project Information

The objective of the project is to convert speech to text in real time and to convert an audio file to text. It uses the Google Speech API to convert the audio to text.


Libraries

  • speech_recognition

  • Google Speech API


We install the module to proceed

# install the modules
!pip install speechrecognition
# PyAudio is needed for microphone input; -y skips the confirmation prompt
!conda install -y pyaudio

Requirement already satisfied: speechrecognition in c:\programdata\anaconda3\lib\site-packages (3.8.1)
Collecting PyAudio
  Using cached PyAudio-0.2.11.tar.gz (37 kB)
Building wheels for collected packages: PyAudio
  Building wheel for PyAudio (setup.py): started
  Building wheel for PyAudio (setup.py): finished with status 'error'
  Running setup.py clean for PyAudio
Failed to build PyAudio
Installing collected packages: PyAudio
    Running setup.py install for PyAudio: started
    Running setup.py install for PyAudio: finished with status 'error'
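
The log above shows PyAudio failing to build with pip, which is why it is installed through conda instead. Before moving on, it can help to confirm that both packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is made up for this example and is not part of either package.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# speech_recognition is the import name of the SpeechRecognition package,
# pyaudio is the import name of PyAudio (needed only for microphone input)
missing = missing_modules(["speech_recognition", "pyaudio"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required modules are available")
```

If `pyaudio` is reported as missing, the audio file example should still work, but the microphone example will not.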


Now we import the module

# import the module
import speech_recognition as sr

We initialize the module

# initialize
r = sr.Recognizer()

Convert Speech to Text in Real time


We will convert real time audio from a microphone into text

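while True:
    with sr.Microphone() as source:
        # calibrate for background noise
        r.adjust_for_ambient_noise(source, duration=0.3)

        print("Speak now")
        # capture the audio
        audio = r.listen(source)

        try:
            text = r.recognize_google(audio)
            print("Speaker:", text)
            if text == 'quit':
                break
        except sr.UnknownValueError:
            # the speech could not be understood
            print('Please say again!!!')
        except sr.RequestError as e:
            # the API was unreachable or returned an error
            print('Request failed:', e)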

Speak now
Speaker: welcome to the channel
Speak now
Speaker: testing speech recognition
Speak now
Speaker: quit

  • Microphone() - Receive audio input from microphone

  • adjust_for_ambient_noise(source, duration=0.3) - Calibrate the recognizer's energy threshold from 0.3 seconds of ambient sound, so that background noise is less likely to be picked up as speech

  • listen(source) - Capture the audio from the source

  • recognize_google(audio) - Send the audio to the Google Speech API and return the recognized text (requires an internet connection)

  • text == 'quit' - Condition to quit the while loop
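
The exact-match check `text == 'quit'` only fires when the transcript is precisely the lowercase word. Assuming the recognizer may return different capitalization or stray whitespace, a slightly more forgiving check could look like the sketch below. The helper `is_quit_command` is a hypothetical name introduced here, not part of the speech_recognition module.

```python
def is_quit_command(text, command="quit"):
    """Return True if the transcript equals the command, ignoring case and surrounding spaces."""
    return text.strip().lower() == command


print(is_quit_command("Quit"))         # True
print(is_quit_command("  quit "))      # True
print(is_quit_command("do not quit"))  # False
```

In the loop, `if text == 'quit':` would then become `if is_quit_command(text):`.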



Convert Audio to Text


Now we will process and convert an audio file into text

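with sr.AudioFile('test.wav') as source:
    print("listening to audio")
    # read the entire audio file
    audio = r.record(source)

    try:
        text = r.recognize_google(audio)
        print("Audio:", text)
    except sr.UnknownValueError:
        # the speech could not be understood
        print('Error: could not understand the audio')
    except sr.RequestError as e:
        # the API was unreachable or returned an error
        print('Error:', e)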

listening to audio
Audio: welcome to speech recognition

  • Displayed text is the same as the speech in the audio file

  • For larger audio files, you need to split them into smaller segments for better processing
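
One way to split a larger file is with Python's built-in `wave` module. This is a sketch under the assumption that the input is an uncompressed PCM WAV file; the function name `split_wav`, the output file naming, and the 30-second default are choices made for this example, not something the tutorial prescribes.

```python
import wave
from pathlib import Path


def split_wav(path, out_dir, chunk_seconds=30):
    """Split a WAV file into consecutive chunks of at most chunk_seconds each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with wave.open(str(path), "rb") as src:
        params = src.getparams()
        # number of audio frames that make up one chunk
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the file
            chunk_path = out_dir / f"chunk_{index:03d}.wav"
            with wave.open(str(chunk_path), "wb") as dst:
                # copy channels, sample width and frame rate from the source;
                # the frame count in the header is corrected on write
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each returned path can then be opened with `sr.AudioFile` and transcribed the same way as `test.wav`, and the partial transcripts joined together. Cutting at fixed intervals may split a word in half, so splitting on pauses could give better results if accuracy at the boundaries matters.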


Final Thoughts

  • Speech-to-text is a very useful tool for converting real-time recordings into text, which can help with chats, interviews, narration, captions, etc.

  • You can also combine this process with speech emotion recognition and further analyze the text for sentiment analysis.

  • The Google speech recognition service worked well in this tutorial. You may use any other module or engine to convert speech into text as per your preference.


In this project tutorial we have explored how to convert speech to text using the SpeechRecognition module with the Google Speech API. We installed the module and converted both a real-time audio recording and an audio file into text data.


Get the project notebook from here


Thanks for reading the article!!!


Check out more project videos from the YouTube channel Hackers Realm
