AI-900: How does object detection find and classify multiple items within a single picture?

Which computer vision task identifies and locates multiple items in an image?

Prepare for the AI-900 exam by learning why object detection is the correct computer vision task for identifying and locating multiple types of items in one image. Understand how it differs from image classification, OCR, and image description to master Azure AI concepts.

Question

______________ is used to identify and locate multiple types of items in one image.

A. Optical character recognition (OCR)
B. Image classification
C. Image description
D. Object detection

Answer

D. Object detection

Explanation

The correct answer is D. Object detection. This computer vision task is specifically designed to identify the presence, location, and type of one or more objects within an image.

Object detection fits the question because it both classifies objects and reports where each one is, using bounding boxes. Image classification, by contrast, assigns a label to the image as a whole and says nothing about position. Detection is therefore the right choice whenever one image may contain several objects of different types that each need to be found.

Understanding Object Detection

Object detection conceptually does two things. It identifies every object in an image that the model has been trained to recognize, and it determines the location of each identified object, typically by outputting the coordinates of a bounding box that encloses it. This capability makes it well suited to analyzing complex scenes that contain multiple items of different types, such as identifying all the cars, pedestrians, and traffic lights in a street-view photograph. The output for each detected object includes a class label (e.g., “car”) and the bounding box coordinates.
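To make the shape of that output concrete, here is a minimal Python sketch of a detection result for the street-view example. It does not call any real service: the `Detection` and `BoundingBox` types, the helper functions, and all the sample values are hypothetical. The `confidence` field is also an assumption that goes beyond the description above, although many detectors report a score of this kind alongside each label and box.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangle in pixel coordinates (assumed convention: top-left corner plus size)."""
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class Detection:
    """One detected object: what it is, where it is, and an assumed confidence score."""
    label: str
    confidence: float
    box: BoundingBox


def keep_confident(detections, threshold=0.5):
    """Return only the detections whose confidence is at or above the threshold."""
    return [d for d in detections if d.confidence >= threshold]


def count_by_label(detections):
    """Count how many objects of each class were detected."""
    return Counter(d.label for d in detections)


# Made-up result for a single street-view image containing several object types.
street_scene = [
    Detection("car", 0.94, BoundingBox(40, 210, 180, 95)),
    Detection("car", 0.88, BoundingBox(300, 220, 160, 90)),
    Detection("pedestrian", 0.81, BoundingBox(520, 190, 45, 130)),
    Detection("traffic light", 0.42, BoundingBox(610, 30, 25, 70)),
]

confident = keep_confident(street_scene, threshold=0.5)
counts = count_by_label(confident)

for d in confident:
    print(f"{d.label}: left={d.box.left}, top={d.box.top}, "
          f"width={d.box.width}, height={d.box.height}")
```

The point of the sketch is that a single image yields a list of results, one per object, and each entry pairs a class label with its own location. An image classifier would return only a label for the whole picture.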

Why Other Options Are Incorrect

A. Optical character recognition (OCR): This is a specialized task focused exclusively on detecting and extracting printed or handwritten text from an image. It does not identify non-textual objects.

B. Image classification: This is a simpler task that assigns a single label to an entire image. For example, it could classify an image as “contains a dog” or “beach scene,” but it does not specify the location of the dog or identify other objects that might be present.

C. Image description: This task generates a human-readable sentence summarizing the content of the image. While the description might mention multiple items, its purpose is to produce a caption, not to provide precise location data for each individual object in the way object detection does.
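The difference between the four options can also be seen in what each task returns. The sketch below is illustrative only: the result types, the `located_labels` helper, and the sample values are all hypothetical and do not correspond to any particular SDK. It shows that only the detection result carries a label and a location for each item.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Assumed box convention: (left, top, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class OcrResult:
    """Text read from the image; no non-text objects are identified."""
    lines: List[str]


@dataclass
class ClassificationResult:
    """One label for the image as a whole, with no location."""
    label: str


@dataclass
class DescriptionResult:
    """A caption sentence; it may mention items but gives no coordinates."""
    caption: str


@dataclass
class DetectionResult:
    """A list of (label, bounding box) pairs, one per detected object."""
    objects: List[Tuple[str, Box]]


def located_labels(result) -> Set[str]:
    """Return the object types for which the result provides a location."""
    if isinstance(result, DetectionResult):
        return {label for label, _box in result.objects}
    return set()


# Made-up results for the same photo of a dog playing with a ball.
ocr = OcrResult(lines=[])
classification = ClassificationResult(label="dog")
description = DescriptionResult(caption="A dog playing with a ball on grass")
detection = DetectionResult(objects=[("dog", (50, 80, 200, 160)),
                                     ("ball", (270, 190, 40, 40))])

for result in (ocr, classification, description, detection):
    print(type(result).__name__, sorted(located_labels(result)))
```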
