AI-102: How to Use Azure Speech SDK for One-Shot Speech Recognition?

Master one-shot speech recognition with Azure Speech SDK for the AI-102 certification. Learn how to configure audio files and language settings for your exam success.

Question

Xerigon Corporation is developing a speech-to-text solution that uses the Speech SDK to perform one-shot speech recognition with input from an audio file. The solution must create a speech recognizer from a WAV file named SportsReporterDescription.wav and specify English as the recognition language.

The following code is part of the solution:
"""performs one-shot speech recognition with input from an audio file"""
# <SpeechRecognitionWithFile>
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_config = speechsdk.audio.A(filename=”SportsReporterDescription.wav”)
speech_recognizer = speechsdk.B
(
speech_config=speech_config, language="en-US", audio_config=audio_config)
result = speech_recognizer.recognize_once()

Which classes are missing from the code?

Drag the appropriate code to the corresponding letter.

Code:

  • GetWaveFormatPCM
  • AudioConfig
  • AudioStreamFormat
  • KeywordRecognizer
  • SpeakerRecognizer
  • SpeechRecognizer

Answer

A. AudioConfig
B. SpeechRecognizer

Explanation

The following code uses the AudioConfig class to define the source of the audio and the SpeechRecognizer class to convert spoken language to text.

"""performs one-shot speech recognition with input from an audio file"""
# <SpeechRecognitionWithFile>
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_config = speechsdk.audio.AudioConfig(filename="SportsReporterDescription.wav")
# Create a speech recognizer using a file as audio input; also specify the speech language
speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config, language="en-US", audio_config=audio_config)
result = speech_recognizer.recognize_once()

The AudioConfig class defines the source of the audio, which can be an audio file, a microphone, or an audio stream. When a filename is supplied, the Speech SDK expects a WAV file by default; compressed formats such as MP3 must instead be fed through an audio input stream configured with a compressed stream format.

The SpeechRecognizer class converts spoken language to text. It supports speech recognition from a microphone, audio files, and audio streams.
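Putting the two classes together, the completed snippet can be wrapped into a reusable function. This is a minimal sketch assuming the `azure-cognitiveservices-speech` package; the function names and the `summarize_result` helper are our own illustration, not part of the SDK, and a valid key, region, and WAV file are required for the service call itself.

```python
def summarize_result(reason_name: str, text: str) -> str:
    """Pure helper (our own): map a result reason name to a short summary."""
    if reason_name == "RecognizedSpeech":
        return f"Recognized: {text}"
    if reason_name == "NoMatch":
        return "No speech could be recognized."
    return f"Recognition ended: {reason_name}"


def recognize_once_from_file(speech_key: str, service_region: str,
                             filename: str) -> str:
    """One-shot recognition from a WAV file (requires valid credentials)."""
    # Imported lazily so summarize_result stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=speech_key,
                                           region=service_region)
    audio_config = speechsdk.audio.AudioConfig(filename=filename)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                            language="en-US",
                                            audio_config=audio_config)
    result = recognizer.recognize_once()  # returns after the first utterance
    return summarize_result(result.reason.name, result.text)
```

Checking `result.reason` matters in practice: a one-shot call can also return `NoMatch` or be canceled, so production code should not assume `result.text` is always populated.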

You would not choose AudioStreamFormat. This class represents the format of an audio input stream, either PCM (the default) or a compressed container format such as MP3. You can configure the sample rate, bits per sample, and number of channels.

You would not choose GetWaveFormatPCM. In the C# SDK, for example, this static method creates an AudioStreamFormat object with specific pulse-code modulation (PCM) wave format characteristics such as the sample rate, bits per sample, and number of channels; in Python, these values are passed directly to the AudioStreamFormat constructor.
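As a hedged illustration of those PCM parameters in the Python SDK, the sketch below declares the 16 kHz, 16-bit, mono format the Speech service expects by default; the `pcm_bytes_per_second` helper is our own, added only to show what the numbers imply.

```python
def pcm_bytes_per_second(samples_per_second: int, bits_per_sample: int,
                         channels: int) -> int:
    """Byte rate implied by a PCM format declaration (pure helper, our own)."""
    return samples_per_second * (bits_per_sample // 8) * channels


def make_default_pcm_format():
    """AudioStreamFormat for 16 kHz, 16-bit, mono PCM."""
    # Lazy import so the pure helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk
    return speechsdk.audio.AudioStreamFormat(samples_per_second=16000,
                                             bits_per_sample=16,
                                             channels=1)
```

For example, `pcm_bytes_per_second(16000, 16, 1)` gives 32000 bytes per second, which is why a one-minute mono recording at this format is roughly 1.9 MB.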

You would not choose KeywordRecognizer. This class recognizes specific keywords or phrases within an audio stream and provides voice activation.
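For contrast, keyword recognition looks quite different from one-shot speech-to-text. The sketch below is an assumption-laden illustration: the `.table` model path is a placeholder for a model exported from Speech Studio, and the `is_keyword_model` helper is our own, not an SDK API.

```python
def is_keyword_model(path: str) -> bool:
    """Our helper: keyword models exported from Speech Studio are .table files."""
    return path.lower().endswith(".table")


def wait_for_keyword(model_path: str) -> str:
    """Block until the keyword is heard on the default microphone (sketch)."""
    # Lazy import so is_keyword_model stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    if not is_keyword_model(model_path):
        raise ValueError("expected a .table keyword model")
    model = speechsdk.KeywordRecognitionModel(model_path)
    recognizer = speechsdk.KeywordRecognizer()  # default microphone input
    result = recognizer.recognize_once_async(model).get()
    return result.text  # the keyword phrase that was detected
```

Note that keyword recognition runs on-device against a trained model file, whereas SpeechRecognizer sends audio to the Speech service for full transcription.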

You would not choose SpeakerRecognizer. This class identifies and verifies speakers based on their voice characteristics.

This free practice question, answer, and detailed explanation will help you prepare for and pass the Microsoft Azure AI Engineer Associate (AI-102) certification exam.