What Happens When Hadoop Writes a File to HDFS?
Learn the exact process of writing a file to HDFS for your Big Data certification. Understand how Hadoop splits files into blocks and replicates them across multiple DataNodes to provide fault tolerance and high availability.
Question
What occurs when Hadoop writes a file to HDFS?
A. File is split into blocks and replicated across multiple DataNodes
B. File is permanently stored in NameNode
C. File is converted into SQL tables
D. File is sent to a single DataNode without backup
Answer
A. File is split into blocks and replicated across multiple DataNodes
Explanation
When a client writes a file to the Hadoop Distributed File System (HDFS), it first contacts the NameNode, which checks permissions and returns the DataNode locations to write to. The file is not kept whole; instead, it is divided into fixed-size chunks called blocks (128 MB each by default). The client then streams these blocks directly to the DataNodes, and HDFS automatically replicates each block (three times by default) across different nodes and racks to ensure high availability and fault tolerance. The file is never permanently stored in the NameNode, which holds only metadata; it is not converted into SQL tables; and it is never sent to just a single DataNode without backup.
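To see this from the client side, here is a minimal sketch using the Hadoop Java FileSystem API. The NameNode address hdfs://namenode:9000 and the path /data/example.txt are placeholders, and the block size and replication settings shown simply make the HDFS defaults explicit; note that the client code never ships file data to the NameNode, only to DataNodes.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; the client asks the NameNode only for
        // metadata (permission checks, block locations), never sends it file data.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        // Defaults made explicit: 128 MB blocks, 3 replicas per block.
        conf.set("dfs.blocksize", "134217728");
        conf.set("dfs.replication", "3");

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/data/example.txt"))) {
            // Data is streamed block by block through a pipeline of DataNodes;
            // HDFS replicates each block across nodes and racks automatically.
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

After the write completes, you can confirm how the file was split and where each replica landed with `hdfs fsck /data/example.txt -files -blocks -locations`, which lists every block and the DataNodes holding its copies.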