Properly splitting data is crucial for evaluating machine learning models. Learn how Azure ML Designer’s Split Data module easily separates datasets into training and validation sets.
Question
You need to create a training dataset and validation dataset from an existing dataset.
Which module in the Azure Machine Learning designer should you use?
A. Select Columns in Dataset
B. Add Rows
C. Split Data
D. Join Data
Answer
C. Split Data
Explanation
A common way to evaluate a model is to divide the data into a training set and a test set by using Split Data, train the model on the training set, and then validate it on the held-out test set.
Use the Split Data module to divide a dataset into two distinct sets.
The studio currently supports training/validation data splits.
The correct answer is C. Split Data.
The Split Data module in the Azure Machine Learning designer allows you to divide a dataset into two distinct sets. This is useful when you need to separate data into training and validation sets for machine learning. You can customize the way that data is divided by choosing different splitting modes, such as Split Rows, Regular Expression Split, or Relative Expression Split.
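The designer itself is a drag-and-drop tool, so no code is required to use Split Data. As a rough illustration of what a row split does, here is a minimal Python sketch using only the standard library. The function name `split_rows` and its parameters (`fraction`, `randomized`, `seed`) are hypothetical, chosen to loosely mirror the kind of settings a row split exposes; this is not the module's actual implementation.

```python
import random


def split_rows(rows, fraction=0.7, randomized=True, seed=0):
    """Divide `rows` into two lists; the first gets about `fraction` of them.

    Hypothetical helper for illustration only, not an Azure ML API.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")

    indices = list(range(len(rows)))
    if randomized:
        # A fixed seed makes the shuffle, and so the split, repeatable.
        random.Random(seed).shuffle(indices)

    cut = round(len(rows) * fraction)
    first = [rows[i] for i in indices[:cut]]
    second = [rows[i] for i in indices[cut:]]
    return first, second


# Ten toy rows split 70/30 into training and validation sets.
data = [{"id": i, "label": i % 2} for i in range(10)]
train, validation = split_rows(data, fraction=0.7, seed=42)
print(len(train), len(validation))  # prints: 7 3
```

Every row lands in exactly one of the two outputs, which is the property that makes the second set usable for validation.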
The other options are not correct for the following reasons:
- Select Columns in Dataset: This module allows you to select a subset of columns from a dataset. It does not split the data into two sets.
- Add Rows: This module allows you to append rows from one dataset to another dataset. It does not split the data into two sets.
- Join Data: This module allows you to join two datasets based on a common key. It does not split the data into two sets.
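To make the contrast concrete, the three distractor operations can be sketched in plain Python on toy data. These are loose analogies under assumed data shapes (lists of dictionaries with hypothetical `id`, `age`, and `label` fields), not the designer's implementation. Each one yields a single dataset, never two.

```python
# Toy datasets; all field names here are made up for illustration.
people = [{"id": 1, "age": 30}, {"id": 2, "age": 41}]
more_people = [{"id": 3, "age": 25}]
labels = [{"id": 1, "label": "yes"}, {"id": 2, "label": "no"}]

# Analogy for Select Columns in Dataset: same rows, fewer columns.
selected = [{"id": row["id"]} for row in people]

# Analogy for Add Rows: rows of one dataset appended to another.
appended = people + more_people

# Analogy for Join Data: an inner join on the shared "id" key.
labels_by_id = {row["id"]: row for row in labels}
joined = [
    {**row, **labels_by_id[row["id"]]}
    for row in people
    if row["id"] in labels_by_id
]

print(selected)
print(len(appended))
print(joined)
```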
Reference
Microsoft Learn > Azure > Machine Learning > Configure training, validation, cross-validation and test data in automated machine learning