Why Must a Reducer Be Associative and Commutative for Combiner Use?
Learn why associative and commutative properties are required for safely using a Hadoop reducer as a combiner, ensuring correct aggregation in MapReduce jobs like sum or max operations.
Question
Which property must hold true for a reducer function to be safely used as a combiner?
A. It must run faster than the mapper
B. It must run without Hadoop configuration
C. It must be associative and commutative
D. It must replace the partitioner function
Answer
C. It must be associative and commutative
Explanation
A reducer function can be safely reused as a combiner only if it is both associative and commutative: the grouping and the order of the input values must not change the result. Associativity means (a + b) + c = a + (b + c), and commutativity means a + b = b + a. This matters because Hadoop may run the combiner zero, one, or many times on arbitrary subsets of a mapper's output before the shuffle. When both properties hold, local aggregation on the map side produces exactly the same final result as shipping every raw value across the network to the reducer. Operations such as sum, max, min, and multiplication satisfy both properties and work reliably as combiners. Averaging does not: avg(avg(1, 2), 3) = 2.25, while avg(1, 2, 3) = 2, so an average reducer cannot be reused directly; the combiner must instead emit partial (sum, count) pairs for the reducer to merge. Likewise, functions that need the complete value set for a key, such as building a deduplicated list of values, fail when dropped in as combiners.
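As a minimal sketch of the safe case (the class name SumReducer and the word-count-style key/value types are illustrative, not taken from a specific source), a standard Hadoop sum reducer can double as a combiner because integer addition is associative and commutative:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums all values for a key. Because addition is associative and
// commutative, running this class as a combiner on map-side partial
// output yields the same final totals as running it only as the reducer.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

In the job driver, the same class is then registered for both roles:

job.setReducerClass(SumReducer.class);
job.setCombinerClass(SumReducer.class); // safe only because sum is associative and commutative

If the reduce logic were averaging instead of summing, the second line would silently corrupt the results, which is exactly why the property in option C is the deciding factor.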