How Do You Check If Different Hiring Recommendation Rates Signal Model Bias?

What Fairness Metrics Should You Compare When Hiring AI Favors One Group?

Learn how to evaluate whether different hiring recommendation rates indicate bias by comparing true positive and false positive rates across demographic groups.

Question

A hiring recommendation model shows a 45% positive recommendation rate for Group A and a 38% positive recommendation rate for Group B. What additional analysis would best help you determine whether this difference represents problematic bias?

A. Calculate the model’s overall accuracy—if it’s above 90%, the model is fair.
B. Calculate true positive rates and false positive rates separately for each group to assess equal opportunity.
C. Remove demographic information from the training data and retrain the model.
D. Run SHAP analysis to find the most important features overall.

Answer

B. Calculate true positive rates and false positive rates separately for each group to assess equal opportunity.

Explanation

A 45% versus 38% recommendation rate shows a difference in outcomes, but it does not by itself prove harmful bias. To judge whether the disparity is problematic, you should also compare true positive rates and false positive rates for each group.

That analysis tests fairness more directly. If one group has a lower true positive rate, qualified candidates from that group are missed more often; if one group has a higher false positive rate, unqualified candidates from that group are recommended more often. These group-level error patterns are central to fairness criteria such as equal opportunity (equal true positive rates across groups) and equalized odds (equal true positive and false positive rates across groups).
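The per-group check described above can be sketched in a few lines. This is a minimal illustration with made-up labels and predictions, not data from the scenario; the `group_rates` helper is hypothetical.

```python
import numpy as np

def group_rates(y_true, y_pred):
    """Return (TPR, FPR) for one group's true labels and model predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tpr = y_pred[y_true == 1].mean()   # P(recommended | actually qualified)
    fpr = y_pred[y_true == 0].mean()   # P(recommended | actually unqualified)
    return tpr, fpr

# Illustrative data: Group A's qualified candidates are all recommended,
# while Group B's are missed more often.
tpr_a, fpr_a = group_rates([1, 1, 1, 0, 0], [1, 1, 1, 1, 0])
tpr_b, fpr_b = group_rates([1, 1, 1, 0, 0], [1, 0, 1, 0, 0])

print(f"Group A: TPR={tpr_a:.2f}, FPR={fpr_a:.2f}")
print(f"Group B: TPR={tpr_b:.2f}, FPR={fpr_b:.2f}")

# An equal-opportunity check looks at the TPR gap between groups.
print(f"TPR gap: {abs(tpr_a - tpr_b):.2f}")
```

A large TPR gap (here 1.00 vs. 0.67) would suggest qualified candidates in one group are being missed disproportionately, which is exactly the signal a raw recommendation-rate comparison cannot reveal.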

Why the others are wrong

A is incorrect because high overall accuracy does not guarantee fairness across groups. A model can be highly accurate overall while still producing uneven error rates for different groups.

C may be a later mitigation step, but it does not answer the immediate question of whether the observed recommendation gap reflects problematic bias. You first need a proper fairness assessment.

D can help explain which features influence predictions, but overall feature importance does not directly tell you whether the model’s error rates differ unfairly between groups.