
IBM AI Fundamentals: Identify Sensitive Personal Information (SPI) in Patient Data Sets for AI

Learn how to determine which patient data qualifies as sensitive personal information (SPI) that must be removed when preparing data sets for AI systems. The question covers four candidate fields: BMI, patient name, marital status, and annual income range. Essential knowledge for the IBM AI Fundamentals certification exam.


Question

While preparing a data set for an AI system dealing with patient data, Nora needs to determine which of the following information qualifies as sensitive personal information (SPI) and needs to be removed from the data set.

Which of the following would be considered SPI?

A. Body Mass Index (BMI)
B. Patient name
C. Marital status
D. Annual income range

Answer

B. Patient name

Explanation

Every item listed carries some privacy risk, but only one of them directly identifies an individual and, if compromised, could be used for unauthorized access or to harm that person.

Of the options provided, patient name (B) is the one that would be considered sensitive personal information (SPI) and needs to be removed from the data set before it is used to train an AI system.

SPI refers to information that can be used to identify a specific individual, either on its own or in combination with other data. A patient’s name clearly falls into this category, as it directly identifies who the patient is. Leaving names in the data set would expose patient identities and risk a breach of patient privacy.
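As a rough illustration of removing a direct identifier before training, the sketch below strips a name field from a list of patient records. The field names (patient_name, bmi, marital_status, income_range), the sample values, and the helper strip_identifiers are all assumptions made for this example; the question itself does not specify a data format.

```python
# Hypothetical field names: the source only lists the four answer options,
# not an actual schema.
DIRECT_IDENTIFIERS = {"patient_name"}


def strip_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the direct-identifier fields."""
    return [
        {field: value for field, value in record.items() if field not in identifiers}
        for record in records
    ]


# Made-up sample records, for illustration only.
patients = [
    {"patient_name": "Patient A", "bmi": 24.1,
     "marital_status": "married", "income_range": "50k-75k"},
    {"patient_name": "Patient B", "bmi": 24.1,
     "marital_status": "single", "income_range": "25k-50k"},
]

cleaned = strip_identifiers(patients)
```

The function builds new dictionaries rather than deleting keys in place, so the original records stay untouched if they are still needed elsewhere.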

The other options would not necessarily be considered SPI:

A. Body Mass Index (BMI) is a health metric that does not inherently identify an individual patient. Many patients can have the same BMI.

C. Marital status, while somewhat personal, applies to large segments of the population and does not uniquely identify individual patients.

D. Annual income range is also not unique to individuals and covers broad population segments. Income data may be useful for analysis without compromising patient privacy.

In summary, patient names are the clear example of sensitive personal information among the options presented, and should be excluded from data sets used to train AI systems in order to protect patient privacy. The other data types can likely be included as long as they are not combined with other fields that could identify individuals.
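The caveat about combined fields can be checked with a simple count. The sketch below, which is one possible approach and not something the exam material prescribes, finds combinations of the remaining fields that occur in only one record, since such a combination could single out a patient even without a name. The field names, sample values, and the helper unique_combinations are hypothetical.

```python
from collections import Counter

# Hypothetical field names and values, made up for this illustration.
QUASI_IDENTIFIERS = ("bmi", "marital_status", "income_range")


def unique_combinations(records, fields=QUASI_IDENTIFIERS):
    """Return the value combinations that occur in exactly one record."""
    counts = Counter(
        tuple(record[field] for field in fields) for record in records
    )
    return [combo for combo, n in counts.items() if n == 1]


records = [
    {"bmi": 24.1, "marital_status": "married", "income_range": "50k-75k"},
    {"bmi": 24.1, "marital_status": "married", "income_range": "50k-75k"},
    {"bmi": 31.7, "marital_status": "widowed", "income_range": "100k+"},
]

# Combinations listed here belong to a single record and deserve a closer look.
risky = unique_combinations(records)
```

In this made-up sample, the first two records share every value and so hide each other, while the third record's combination appears only once and would be flagged.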
