AI-900: How do you choose between classification and clustering for a dataset with defined labels in Azure ML?


Why is classification the right machine learning type for categorizing labeled customer data?

Prepare for the AI-900 exam by learning why classification is the correct machine learning type when you need to categorize customer types using sales data with defined labels. Understand the difference between classification (supervised learning with labels) and clustering (unsupervised learning without labels).

Question

You have a dataset that contains sales data and has defined labels for types of customers. You need to categorize the customer types based on the sales data. Which type of machine learning should you use?

A. Translating
B. Classification
C. Regression
D. Clustering

Answer

B. Classification

Explanation

The correct answer is B. Classification. The task is to predict a discrete, predefined category (customer type) from input features (the sales data), and the dataset already contains known labels that a model can be trained on.

Understanding Classification

Classification is a type of supervised learning. The term “supervised” is key here, as it means the model is trained on a dataset that includes both the input features (the sales data) and the correct output labels (the defined “types of customers”). The algorithm learns the patterns that map the sales data to a specific customer category. Once trained, the model can take sales data from a new, uncategorized customer and predict which of the predefined categories they belong to. The output is a specific class label, such as “High-Value,” “Occasional,” or “New Customer.”
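The train-then-predict flow described above can be sketched in a few lines of plain Python. This is only an illustration, not how Azure Machine Learning implements classification: the feature columns (monthly orders, average order value), the sample rows, and the helper names `fit_centroids` and `predict` are all assumptions made for the example, and a simple nearest-centroid rule stands in for whatever algorithm a real service would use.

```python
from collections import defaultdict
from math import dist

# Hypothetical labeled training data:
# (monthly_orders, average_order_value) -> known customer type.
training_rows = [
    ((12.0, 250.0), "High-Value"),
    ((10.0, 300.0), "High-Value"),
    ((2.0, 40.0), "Occasional"),
    ((3.0, 55.0), "Occasional"),
]


def fit_centroids(rows):
    """'Train' by averaging the feature vectors that share each label."""
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the predefined label whose centroid is closest to the input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(training_rows)

# New, uncategorized customers are assigned one of the predefined labels.
print(predict(centroids, (11.0, 270.0)))  # High-Value
print(predict(centroids, (1.0, 30.0)))    # Occasional
```

The point to notice is that the output is always one of the labels that already existed in the training data; the model never invents a new category.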

Why Other Options Are Incorrect

A. Translating: This is a natural language processing task for converting text from one language to another and is not relevant to this data analysis scenario.

C. Regression: This supervised learning technique predicts a continuous numerical value, not a category. You would use regression if the goal were to predict the amount of future sales a customer might generate, rather than the customer's type.

D. Clustering: This is a form of unsupervised learning, which you would use if your dataset did not have defined labels. A clustering algorithm would group customers into segments based on similarities in their sales data, but it would be up to a data analyst to interpret and name those automatically generated groups. Since the labels are already known, classification is the appropriate supervised method.
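To make the contrast with options C and D concrete, the sketch below shows what each technique returns on small, made-up data. It is a plain-Python illustration under stated assumptions, not Azure Machine Learning code: the sample values and the helper names `fit_line` and `k_means` are hypothetical, the regression is a one-variable least-squares fit, and the clustering is a basic k-means loop with hand-picked starting centroids.

```python
from math import dist

# --- Regression: the output is a number, not a category ---
# Hypothetical data: past monthly orders -> sales amount.
orders = [1.0, 2.0, 3.0, 4.0]
sales = [110.0, 210.0, 310.0, 410.0]


def fit_line(xs, ys):
    """Ordinary least squares for a single input feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(orders, sales)
predicted_sales = slope * 5.0 + intercept
print(predicted_sales)  # a continuous value (510.0 for this data)

# --- Clustering: no labels go in, and only unnamed groups come out ---
# Hypothetical unlabeled customers: (monthly_orders, average_order_value).
customers = [(12.0, 250.0), (10.0, 300.0), (2.0, 40.0), (3.0, 55.0)]


def k_means(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)), key=lambda i: dist(centroids[i], point)
            )
            clusters[nearest].append(point)
        centroids = [
            tuple(sum(column) / len(column) for column in zip(*cluster))
            if cluster
            else centroid  # keep an empty cluster's centroid unchanged
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids


clusters, final_centroids = k_means(
    customers, centroids=[customers[0], customers[2]]
)
for index, members in enumerate(clusters):
    print(f"cluster {index}: {members}")  # groups are numbered, not named
```

Regression yields a sales figure and clustering yields groups identified only by an index, so neither produces one of the predefined customer types that the question asks for.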
