What Is the Purpose of HDFS Block Replication Factor in Hadoop Storage?
Learn why HDFS replicates data blocks in Hadoop. Understand how block replication improves fault tolerance and high availability by keeping multiple copies across DataNodes, ensuring data stays accessible during failures.
Question
Why does HDFS replicate data blocks?
A. To reduce the size of the dataset
B. To support SQL-style queries
C. To increase the speed of computation
D. To provide fault tolerance and high availability of data
Answer
D. To provide fault tolerance and high availability of data
Explanation
HDFS replicates each data block across multiple DataNodes so a file remains accessible even if a disk, node, or network path fails. By keeping multiple copies of every block (the default replication factor is 3), HDFS can transparently serve reads from a surviving replica while the NameNode schedules re-replication of the missing copies in the background.
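For concreteness, here is a minimal Java sketch showing how the replication factor can be inspected and changed through Hadoop's `FileSystem` API. The path `/data/example.txt` is a hypothetical placeholder, and the sketch assumes a reachable HDFS cluster configured via the default `Configuration`.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // dfs.replication sets the default replication factor for newly created files
        conf.set("dfs.replication", "3");

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/data/example.txt"); // hypothetical path

        // Read the replication factor currently recorded for this file
        short current = fs.getFileStatus(file).getReplication();
        System.out.println("Current replication factor: " + current);

        // Request a new replication factor for this file; the NameNode
        // creates (or removes) replicas asynchronously in the background
        fs.setReplication(file, (short) 2);

        fs.close();
    }
}
```

The same adjustment is commonly done from the command line with `hdfs dfs -setrep`; raising the factor trades extra storage for higher fault tolerance, while lowering it does the reverse.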