Learn how to use the Split Data module in the Azure Machine Learning designer to create a training dataset and validation dataset from an existing dataset. This module is essential for building and testing machine learning models.
Question
You need to create a training dataset and validation dataset from an existing dataset. Which module in the Azure Machine Learning designer should you use?
A. Select Columns in Dataset
B. Add Rows
C. Split Data
D. Join Data
Answer
C. Split Data
Explanation
The correct answer is C. Split Data.
A common way of evaluating a model is to divide the data into a training set and a test set by using Split Data, train the model on the training set, and then validate it on the held-out set that the model has not seen. Use the Split Data module to divide a dataset into two distinct sets. The studio currently supports training/validation data splits.
The Split Data module in the Azure Machine Learning designer divides a dataset into two parts: one for training the model and the other for validating it. In the default row-splitting mode, you specify the fraction of rows to send to the first output (the remaining rows go to the second output) and choose whether to randomize the selection of rows or use stratified sampling. You can also choose a different splitting mode: a regular expression split applies a pattern to a text column, and a relative expression split applies a comparison to a numeric column. In both cases, rows that meet the condition go to one output and all other rows go to the other.
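The row-splitting behavior described above can be sketched in plain Python. This is only an illustration of the idea, not the designer's implementation or API: the function names `split_rows` and `split_by_condition`, their parameters, and the sample rows are all hypothetical, and stratified sampling is left out for brevity.

```python
import random


def split_rows(rows, fraction=0.7, randomize=True, seed=42):
    """Return (first, second); `first` holds about `fraction` of the rows.

    Hypothetical helper that mimics a fraction-based row split.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    indices = list(range(len(rows)))
    if randomize:
        # A fixed seed makes the shuffle, and so the split, repeatable.
        random.Random(seed).shuffle(indices)
    cut = round(len(rows) * fraction)
    first = [rows[i] for i in indices[:cut]]
    second = [rows[i] for i in indices[cut:]]
    return first, second


def split_by_condition(rows, column, predicate):
    """Rows whose `column` value satisfies `predicate` go to the first
    output; every other row goes to the second (expression-style split)."""
    matched = [r for r in rows if predicate(r[column])]
    rest = [r for r in rows if not predicate(r[column])]
    return matched, rest


# Made-up sample data: ten rows with ages 20 through 29.
rows = [{"id": i, "age": 20 + i} for i in range(10)]

# 70% of the rows for training, the remaining 30% for validation.
train, validation = split_rows(rows, fraction=0.7)

# Condition on a numeric column, similar in spirit to a relative expression.
older, younger = split_by_condition(rows, "age", lambda value: value > 25)
```

In both functions every input row ends up in exactly one of the two outputs, which is the property that makes the result usable as a training set and a validation set.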
The other modules do not partition a dataset, so they cannot create a training dataset and a validation dataset from an existing dataset. The Select Columns in Dataset module selects specific columns from a dataset and keeps every row. The Add Rows module appends the rows of one dataset to another. The Join Data module combines two datasets into one by matching rows on key columns.
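To see why none of these operations yields a training/validation pair, here is a rough plain-Python analogy for each. The function names and sample data are hypothetical and do not correspond to any designer API; the join is sketched as an inner join, which is an assumption made for brevity.

```python
def select_columns(rows, columns):
    """Keep only the named columns; the number of rows is unchanged."""
    return [{c: r[c] for c in columns} for r in rows]


def add_rows(first, second):
    """Append the rows of `second` to `first`; the result is one dataset."""
    return first + second


def join_data(left, right, key):
    """Inner join: merge rows from both datasets that share a `key` value."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Made-up sample data.
people = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]
scores = [{"id": 1, "score": 90}, {"id": 3, "score": 75}]

names_only = select_columns(people, ["name"])
stacked = add_rows(people, [{"id": 4, "name": "Cy"}])
joined = join_data(people, scores, "id")
```

Each function returns a single dataset, whereas a train/validation split needs one dataset to come out as two.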
This is a practice question and answer, with a detailed explanation, for the Microsoft Azure AI Fundamentals (AI-900) certification exam.