
Speech Recognition with Python 3

Voice recognition refers to the automatic recognition of human speech, and it is one of the most important tasks in the domain of human-computer interaction. For example, Alexa and Google Home, which you may have used, are built on this technology. In this article we will go step by step through how to achieve speech recognition with Python 3, but first…

What’s Speech Recognition with Python 3?

Speech recognition is a technology that allows spoken input into systems. You talk to your computer, phone, or device and it uses what you said as input to trigger some action.
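
As a rough illustration of that last step, once speech has been turned into text, the text can be matched against known phrases to trigger an action. The sketch below assumes a hypothetical handle_command helper and made-up phrases; neither comes from the SpeechRecognition library.

```python
def handle_command(text, actions):
    """Run the first action whose keyword appears in the transcribed text.

    `actions` maps a lowercase phrase to a function with no arguments.
    Returns the action's result, or None when no phrase matches.
    """
    normalized = text.lower()
    for keyword, action in actions.items():
        if keyword in normalized:
            return action()
    return None


# Hypothetical phrases and actions, for illustration only
actions = {
    "lights on": lambda: "turning lights on",
    "play music": lambda: "starting playback",
}

print(handle_command("Please turn the lights on", actions))
```

In a real program, the text argument would be the string returned by the recognizer.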

Installing SpeechRecognition Library

$ pip install SpeechRecognition

Speech Recognition from Audio Files (Code)

This code transcribes an audio file stored on the computer.

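import speech_recognition as sr

def audioTranscript(audio, lang):
    """Transcribe an audio file with the Google Speech API."""
    r = sr.Recognizer()
    # sr.AudioFile reads WAV, AIFF, and FLAC; convert other formats (e.g. m4a) first
    with sr.AudioFile(audio) as source:
        audio_data = r.record(source)  # read the whole file
    text = r.recognize_google(audio_data, language=lang)
    return text

audio = "./assets/test3.wav"
lang = "es"
audio_str = audioTranscript(audio, lang)
print(audio_str)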

Speech Recognition from Microphone (Code)

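import speech_recognition as sr

def audioTranscript():
    """Record 5 seconds from the default microphone and transcribe it."""
    r = sr.Recognizer()
    # sr.Microphone requires the PyAudio package to be installed
    with sr.Microphone() as source:
        audio_data = r.record(source, duration=5)  # capture 5 seconds of audio
    text = r.recognize_google(audio_data)
    return text

audio_str = audioTranscript()
print(audio_str)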

To convert speech to text, the only class we need is the Recognizer class of the speech_recognition module. Depending on the underlying API used to convert speech to text, the Recognizer class has the following methods:

  • recognize_bing(): uses the Microsoft Bing Speech API
  • recognize_google(): uses the Google Speech API
  • recognize_google_cloud(): uses the Google Cloud Speech API
  • recognize_houndify(): uses the Houndify API by SoundHound
  • recognize_ibm(): uses the IBM Speech to Text API
  • recognize_sphinx(): uses the PocketSphinx API
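
Because these methods share the same calling pattern (audio data in, text out), the engine can be selected by name. The following is a minimal sketch, not part of the library: ENGINES and transcribe_with are hypothetical names, and which methods are actually available depends on the installed SpeechRecognition version and on the credentials each service requires.

```python
# Hypothetical lookup table: short engine name -> Recognizer method name
ENGINES = {
    "bing": "recognize_bing",
    "google": "recognize_google",
    "google_cloud": "recognize_google_cloud",
    "houndify": "recognize_houndify",
    "ibm": "recognize_ibm",
    "sphinx": "recognize_sphinx",
}


def transcribe_with(recognizer, audio_data, engine="google", **options):
    """Call the Recognizer method that matches `engine`.

    `recognizer` is expected to be an sr.Recognizer instance and `audio_data`
    the object returned by recognizer.record(...). Extra keyword arguments
    (language, credentials, ...) are passed through unchanged.
    """
    if engine not in ENGINES:
        raise ValueError(f"unknown engine {engine!r}; choose from {sorted(ENGINES)}")
    method = getattr(recognizer, ENGINES[engine])
    return method(audio_data, **options)


# Example use (assumes `r` and `audio_data` were created as in the code above):
#   text = transcribe_with(r, audio_data, "google", language="es")
```

Keeping the engine name in one place makes it easier to switch services later without touching the recording code.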

Conclusion

Voice recognition has several useful applications in the domain of human-computer interaction and automatic voice transcription.

One point to note: if you are going to use voice recognition commercially, use your own API key for the service you choose, so that your service is not interrupted.
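
As a small sketch of how a key might be supplied, recognize_google accepts a key argument. The helper name recognize_with_key and the environment variable GOOGLE_SPEECH_API_KEY are assumptions made for this example; other services take their credentials through different parameters.

```python
import os


def recognize_with_key(recognizer, audio_data, lang="en-US",
                       env_var="GOOGLE_SPEECH_API_KEY"):
    """Transcribe with recognize_google, using a key read from the environment.

    `env_var` is a hypothetical variable name. When it is not set, key is None
    and the library falls back to its built-in default key.
    """
    key = os.environ.get(env_var)
    return recognizer.recognize_google(audio_data, key=key, language=lang)


# Example use (assumes `r` and `audio_data` were created as in the code above):
#   text = recognize_with_key(r, audio_data, lang="es")
```

Reading the key from the environment keeps it out of the source code.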

If you want to learn about another kind of recognition, this article may interest you: face recognition

Luis Estrada