What Does the Reducer Sum in the Classic Hadoop Word Count Example?
In Hadoop’s Word Count, the reducer sums all the 1s emitted by the mappers for each unique word after shuffle-and-sort grouping, producing the final <word, frequency> output pairs used for text frequency analysis.
Question
In the classic Word Count job, what is the reducer’s main role?
A. Deleting duplicate words from output
B. Summing counts for each unique word
C. Sorting words alphabetically
D. Splitting input text into tokens
Answer
B. Summing counts for each unique word
Explanation
In the classic Word Count job, the reducer receives grouped intermediate key-value pairs from all mappers after the shuffle/sort phase (e.g., <“apple”, [1, 1, 1]>), where each unique word serves as the key and an iterable of 1s represents its occurrences across the dataset. The reducer iterates through these values and sums them into a total count (e.g., sum([1, 1, 1]) → 3), then emits the final output pair <word, total_count>, which is written to HDFS. This aggregation step completes the distributed counting process, leveraging Hadoop’s automatic grouping by key. The other options describe work done elsewhere: token splitting happens in the mapper, sorting is managed by the framework during shuffle, and duplicates aren’t explicitly deleted since identical keys are inherently grouped.
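To make the reducer’s role concrete, here is a minimal sketch of the whole job, closely following the canonical Apache Hadoop MapReduce tutorial’s WordCount. The class names (WordCount, TokenizerMapper, IntSumReducer) mirror that tutorial; treat this as an illustrative sketch rather than production code.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: splits each input line into tokens and emits <word, 1>.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE); // e.g. <"apple", 1>
      }
    }
  }

  // Reducer: receives <word, [1, 1, 1, ...]> after shuffle/sort
  // and sums the grouped 1s into a total count.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get(); // accumulate occurrences of this word
      }
      result.set(sum);
      context.write(key, result); // final <word, total_count>, e.g. <"apple", 3>
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // optional local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that IntSumReducer also serves as the combiner here, as in the tutorial: because summing is associative and commutative, the same class can pre-aggregate counts on each mapper node before the shuffle, reducing network traffic without changing the final result.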