
AI-900: How does Azure AI Speech enable text-to-speech for a voice-controlled assistant?

Which Azure Cognitive Service provides spoken responses for voice apps?

Prepare for the AI-900 exam by learning why Azure AI Speech is the correct service for providing spoken responses. Understand its text-to-speech (TTS) capability and how it differs from Azure AI Language and Translator for building voice-controlled personal assistants.

Question

You plan to use Azure Cognitive Services to develop a voice-controlled personal assistant app. Which Azure Cognitive Service would you use to provide a spoken response to the user?

A. Azure AI Language service
B. Azure AI Translator
C. Azure AI Speech

Answer

C. Azure AI Speech

Explanation

The correct answer is C. Azure AI Speech. The Speech service includes a text-to-speech capability, which is exactly what a voice-controlled personal assistant needs: it converts the app’s textual response into audible speech so that the user hears a spoken reply.

Understanding Text-to-Speech in Azure AI Speech

The Azure AI Speech service offers a comprehensive suite of capabilities for processing audio, including both speech-to-text and text-to-speech. To “provide a spoken response,” you need the text-to-speech (TTS) feature. This works as follows:

  1. Your application logic determines the response it wants to give to the user (e.g., “Your next appointment is at 3 PM”).
  2. This text string is sent to the Azure AI Speech service’s TTS API.
  3. The service synthesizes the text into a natural-sounding audio stream or file.
  4. The app then plays this audio back to the user as the spoken response.
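The steps above can be sketched against the Speech service's REST text-to-speech endpoint using only the Python standard library. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, the key and region are placeholders you would supply from your own Speech resource, and the voice name and output format are assumed choices. Playback (step 4) is left to the app.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Wrap the assistant's reply text (step 1) in a minimal SSML document.

    The voice name is an assumed example; use any voice your resource offers.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">{escape(text)}</voice>'
        "</speak>"
    )


def build_tts_request(text, key, region):
    """Step 2: prepare (but do not send) the HTTP request for the TTS endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # placeholder: your resource key
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",  # assumed format
        "User-Agent": "assistant-sketch",
    }
    return urllib.request.Request(
        url, data=build_ssml(text).encode("utf-8"), headers=headers, method="POST"
    )


def synthesize(text, key, region):
    """Step 3: send the request and return the synthesized audio as bytes."""
    with urllib.request.urlopen(build_tts_request(text, key, region)) as resp:
        return resp.read()
```

Calling `synthesize("Your next appointment is at 3 PM", key, region)` with valid credentials should return audio bytes that the app can write to a file or hand to an audio player.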

Azure AI Speech offers a wide range of highly realistic neural voices across many languages, genders, and speaking styles, which makes it well suited to giving a personal assistant a natural and engaging voice.
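Voice and speaking style are typically selected through SSML. The sketch below builds an SSML document that requests a style via the `mstts:express-as` element. The helper name is hypothetical, and the voice and style in the usage note are assumptions: not every voice supports every style, so check what your chosen voice offers.

```python
from xml.sax.saxutils import escape, quoteattr


def styled_ssml(text, voice, style, lang="en-US"):
    """Build SSML asking the given voice to speak the text in a given style.

    Hypothetical helper for illustration; voice and style support vary.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"
        "</mstts:express-as>"
        "</voice></speak>"
    )
```

For example, `styled_ssml("Your next appointment is at 3 PM", "en-US-JennyNeural", "cheerful")` produces a document that can be sent to the text-to-speech API in place of plain text.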

Why Other Options Are Incorrect

A. Azure AI Language service: This service would be used to understand the user’s initial spoken command after it has been converted to text. Its features, like conversational language understanding, help determine the user’s intent, but it cannot generate the final spoken output.

B. Azure AI Translator: This service translates text from one language to another. It does not synthesize audio; speech translation and spoken output are themselves features of Azure AI Speech. Translation is also not what the scenario calls for, because the app needs to speak a response in its own language. The direct tool for generating speech from text is Azure AI Speech.

