How Does the HDFS Client Retrieve Metadata From the NameNode?

What Is the First Step When Reading a File From HDFS?

Learn the first step of the HDFS file read operation for your Hadoop certification. Understand why the client must request metadata and block locations from the NameNode before it can stream data directly from DataNodes.

Question

What is the first step when reading a file from HDFS?

A. Client requests metadata from the NameNode
B. File is downloaded completely before use
C. Client accesses DataNodes directly to fetch blocks
D. Client runs a MapReduce job automatically

Answer

A. Client requests metadata from the NameNode

Explanation

The first step in reading a file from the Hadoop Distributed File System (HDFS) is for the client to ask the NameNode for the file’s metadata. When the client calls open() on the FileSystem object, it issues a remote procedure call (RPC) to the NameNode. The NameNode checks that the file exists and that the client has permission to read it, then returns, for the first few blocks of the file, the addresses of the DataNodes that hold each block. Only with those block locations can the client connect to the DataNodes and stream the data directly, which is why option C is not the first step. Options B and D are also wrong: the client does not have to download the whole file before using it, and reading a file does not launch a MapReduce job.
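The order of operations can be sketched with a small toy model in Python. This is not the Hadoop client API: the classes ToyNameNode and ToyDataNode, the function read_file, and the paths, users, and node names are all hypothetical, chosen only to illustrate that the metadata lookup comes before any block is read. As a simplification, the toy NameNode returns locations for every block at once, whereas the explanation above describes it returning only the first few.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataNode:
    """Hypothetical DataNode: stores block contents keyed by block ID."""
    address: str
    blocks: dict = field(default_factory=dict)

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class ToyNameNode:
    """Hypothetical NameNode: holds metadata only, never file contents."""
    # path -> ordered list of (block_id, [addresses of DataNodes with a replica])
    block_map: dict = field(default_factory=dict)
    # path -> set of users allowed to read the file
    readers: dict = field(default_factory=dict)

    def get_block_locations(self, path, user):
        # Mirrors the two checks described above: existence, then permission.
        if path not in self.block_map:
            raise FileNotFoundError(path)
        if user not in self.readers.get(path, set()):
            raise PermissionError(f"{user} may not read {path}")
        return self.block_map[path]


def read_file(namenode, datanodes, path, user):
    # Step 1: ask the NameNode for metadata (block IDs and their locations).
    locations = namenode.get_block_locations(path, user)

    # Step 2: only now contact DataNodes, fetching each block in order.
    chunks = []
    for block_id, addresses in locations:
        node = datanodes[addresses[0]]  # toy choice: always use the first replica
        chunks.append(node.read_block(block_id))
    return b"".join(chunks)


# Assumed sample cluster: two DataNodes, one file split into two blocks.
nodes = {
    "dn1": ToyDataNode("dn1", {"blk_1": b"hello "}),
    "dn2": ToyDataNode("dn2", {"blk_2": b"hdfs"}),
}
namenode = ToyNameNode(
    block_map={"/data/greeting.txt": [("blk_1", ["dn1"]), ("blk_2", ["dn2"])]},
    readers={"/data/greeting.txt": {"alice"}},
)

data = read_file(namenode, nodes, "/data/greeting.txt", "alice")
print(data.decode())  # prints: hello hdfs
```

In this sketch the client cannot skip step 1, because without the list returned by get_block_locations it has no way to know which node holds which block. A missing file or a user without read permission fails at the NameNode before any DataNode is contacted.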