AI-900: How to Split Data in Azure Machine Learning Designer

Learn how to use the Split Data module in the Azure Machine Learning designer to create training and validation datasets for machine learning.

Question

Which module in the Azure Machine Learning designer should you use if you want to create a training dataset and a validation dataset from an existing dataset?

A. Add rows
B. Split data
C. Select columns in dataset
D. Join data

Answer

B. Split data

Explanation

The Split Data module divides an existing dataset into two outputs, which can serve as a training dataset and a validation dataset.

The correct answer is B. Split data. The Split Data module in the Azure Machine Learning designer divides a dataset into two distinct output datasets, which is what you need when separating data into training and validation sets. The other options do not partition rows: Add rows appends one dataset to another, Select columns in dataset keeps a subset of columns, and Join data merges two datasets on a key column.

When splitting by rows, you specify the fraction of rows to send to the first output, whether rows are selected at random, and optionally whether to use a stratified split based on the values in a column. For example, to split your data 80/20 into a training set and a validation set, set the Fraction of rows in the first output dataset option to 0.8 and select the Randomized split option. This creates two output datasets, one with 80% of the rows and the other with the remaining 20%, chosen randomly from the original dataset. You can then use the training set to train your model and the validation set to evaluate its performance.
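The designer performs this split through module settings rather than code. As a rough analogy only, the sketch below shows what an 80/20 randomized split, with an optional stratified variant, might look like in plain Python. The function `split_rows`, its parameter names, and the sample rows are hypothetical illustrations. They are not part of any Azure Machine Learning API and do not describe how the module is implemented internally.

```python
import random
from collections import defaultdict


def split_rows(rows, fraction=0.8, randomized=True, seed=0, stratify_key=None):
    """Return (first, second) with about `fraction` of the rows in `first`.

    Hypothetical helper for illustration; not an Azure ML function.
    """
    rng = random.Random(seed)  # fixed seed makes the split repeatable

    def split_group(group):
        group = list(group)  # copy so the caller's list is not reordered
        if randomized:
            rng.shuffle(group)
        cut = round(len(group) * fraction)
        return group[:cut], group[cut:]

    if stratify_key is None:
        return split_group(rows)

    # Stratified variant: split each class separately so both outputs
    # keep roughly the same class proportions as the input.
    groups = defaultdict(list)
    for row in rows:
        groups[row[stratify_key]].append(row)

    first, second = [], []
    for group in groups.values():
        head, tail = split_group(group)
        first.extend(head)
        second.extend(tail)
    return first, second


# Assumed sample data: 100 rows, 30 of them labeled "yes".
rows = [{"id": i, "label": "yes" if i < 30 else "no"} for i in range(100)]

train, validation = split_rows(rows, fraction=0.8, seed=42)
print(len(train), len(validation))  # 80 20

train_s, validation_s = split_rows(rows, fraction=0.8, seed=42, stratify_key="label")
print(sum(r["label"] == "yes" for r in train_s))  # 24
```

In the stratified call, each label group is split 80/20 on its own, so the training output keeps the same 30% share of "yes" rows as the original data.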

This question is part of a free set of Microsoft Azure AI Fundamentals AI-900 certification exam practice questions and answers (Q&A), with detailed explanations and references, intended to help you pass the AI-900 exam and earn the Microsoft Azure AI Fundamentals certification.
