AI-900: Which Computer Vision System Identifies Objects with Bounding Boxes?

Confused about computer vision systems on the AI-900 exam? Learn how object detection returns a class and a bounding box for each object in an image, and why that distinction matters for Microsoft’s Azure AI Fundamentals exam.

Question

Which of the following computer vision systems returns the class of every object in an image as well as the bounding boxes for each object?

A. Image classification
B. Optical character recognition
C. Image captioning
D. Object detection

Answer

D. Object detection

Explanation

Object detection is a computer vision technique designed to identify and localize multiple objects within a single image, which matches the behavior described in the question. For each object it detects, it returns the following information:

  • Class (category): The type of object (e.g., person, car, dog).
  • Probability score: The confidence level of the classification (how likely it is that the prediction is correct).
  • Bounding box: A rectangular shape defined by coordinates that indicates the object’s location in the image.

While image classification answers the question “What is the main item in the image?” (e.g., “beach”), object detection answers the questions “What objects are in the image?” (e.g., “person,” “umbrella”) and “Where are they located?” (bounding boxes). As an example, imagine an image of a beach with a person holding an umbrella:

  • Image classification: Might classify the image as “beach scene.”
  • Object detection: Would identify two objects: “person” and “umbrella,” along with their locations and the confidence scores for each classification.
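
To make the beach example concrete, here is a minimal Python sketch of what the two kinds of output might look like. The class names (`BoundingBox`, `Detection`), the left/top/width/height layout, and all of the numbers are assumptions made up for illustration; they are not the response format of any particular Azure service.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    # Pixel coordinates; the (left, top, width, height) layout is an assumption.
    left: int
    top: int
    width: int
    height: int


@dataclass
class Detection:
    label: str         # class (category) of the object
    confidence: float  # probability score between 0 and 1
    box: BoundingBox   # where the object sits in the image


# Hypothetical results for the beach photo; every value here is invented.
# Image classification: one label for the whole image, no location.
classification_result = {"label": "beach scene", "confidence": 0.94}

# Object detection: one entry per object, each with its own box.
detection_result = [
    Detection("person", 0.91, BoundingBox(left=120, top=80, width=60, height=180)),
    Detection("umbrella", 0.87, BoundingBox(left=90, top=20, width=140, height=90)),
]


def describe(detections):
    """Return one human-readable line per detected object."""
    return [
        f"{d.label} ({d.confidence:.0%}) at "
        f"left={d.box.left}, top={d.box.top}, {d.box.width}x{d.box.height}"
        for d in detections
    ]


for line in describe(detection_result):
    print(line)
```

The point of the sketch is the shape of the data: classification yields a single label, while detection yields a list in which every item carries a class, a score, and a box.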

Object detection allows computers not only to recognize objects in images but also to indicate where each one is located.

Image classification only identifies the dominant or most likely class for the entire image, not individual objects and their locations.
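
As a rough illustration of "most likely class for the entire image," classification can be thought of as picking the highest-scoring label from a set of per-class scores. The sketch below assumes a hypothetical score dictionary with invented values; a real model would produce these scores itself.

```python
# Hypothetical per-class scores for the whole image (invented numbers).
scores = {"beach scene": 0.94, "city street": 0.04, "forest": 0.02}


def classify(class_scores):
    """Return the single highest-scoring label and its score."""
    label = max(class_scores, key=class_scores.get)
    return label, class_scores[label]


label, confidence = classify(scores)
print(label, confidence)
```

Note that nothing in this result says how many objects are present or where they are, which is why classification does not fit the question.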

Optical character recognition (OCR) technology focuses on extracting text from images, not on object recognition or localization.

Image captioning is also a computer vision task, but it does not return classes or bounding boxes for individual objects. Instead, it generates a descriptive sentence that summarizes the content of the entire image.

Microsoft Azure AI Fundamentals AI-900 certification exam practice question and answer (Q&A) dump

This Microsoft Azure AI Fundamentals AI-900 certification exam practice question and answer (Q&A) dump, with detailed explanations and references, is available free to help you pass the AI-900 exam and earn the Microsoft Azure AI Fundamentals certification.