Microsoft Azure AI Fundamentals (AI-900) Practice Exam

Prepare for the Microsoft Azure AI Fundamentals certification with flashcards and multiple-choice questions. Enhance your understanding with helpful hints and explanations. Get ready for your certification success!


To create both a training and validation dataset from an existing dataset in Azure Machine Learning, which module should be used?

  1. Combine Data

  2. Split Data

  3. Prepare Data

  4. Transform Data

The correct answer is: Split Data

The Split Data module is the right choice for creating a training and a validation dataset from an existing dataset in Azure Machine Learning. It divides a single dataset into two output datasets: one to train the model and one to evaluate its performance. Holding out validation data is a best practice because it makes overfitting detectable and shows how well the model generalizes to unseen data. The module lets you specify the proportion of rows that goes to each output, for example 70% for training and 30% for validation.

The other options do not split a dataset. Combine Data implies merging multiple datasets, Prepare Data implies preprocessing and cleaning, and Transform Data implies modifying data structures or features. These are useful steps in a data preparation workflow, but none of them produces separate training and validation sets.
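
To make the idea concrete, here is a minimal Python sketch of what a fraction-based split does. It is not the Azure Machine Learning implementation: the function `split_rows` and its parameters `fraction`, `randomized`, and `seed` are hypothetical names chosen for illustration, loosely modeled on the kind of settings a split step usually exposes.

```python
import random


def split_rows(rows, fraction=0.7, randomized=True, seed=0):
    """Split rows into two lists; the first holds about `fraction` of them.

    Hypothetical helper for illustration only, not an Azure ML API.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rows = list(rows)  # copy so the caller's data is not reordered
    if randomized:
        # A fixed seed makes the shuffle, and so the split, repeatable.
        random.Random(seed).shuffle(rows)
    cut = round(len(rows) * fraction)
    return rows[:cut], rows[cut:]


dataset = list(range(10))
training, validation = split_rows(dataset, fraction=0.7, seed=42)
print(len(training), len(validation))  # prints "7 3"
```

Every row lands in exactly one of the two outputs, so the validation rows are never seen during training. That separation is what makes the validation score a fair estimate of performance on unseen data.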