What Main Class Entry Goes in Hadoop JAR Manifest Files?
Hadoop JAR manifests specify the main class used to run the MapReduce driver, so a job can be submitted without naming that class on the command line. Learn its role in packaging Hive & Pig projects for big data certification.
Question
What is typically included in a JAR manifest for a Hadoop project?
A. The main class to be executed when the JAR runs
B. The HDFS directory structure
C. Configuration of Hive tables
D. The mapper and reducer source code
Answer
A. The main class to be executed when the JAR runs
Explanation
In a Hadoop project, the JAR manifest (META-INF/MANIFEST.MF) typically includes the Main-Class attribute, which gives the fully qualified name of the driver class containing the main() method. With it set, the job can be launched with "hadoop jar myproject.jar" followed only by the program's own arguments, without naming the class on the command line. The driver is the entry point that configures the MapReduce job: it sets the input and output paths, the mapper and reducer classes, the output key and value types (e.g., Text, IntWritable), the number of reducers, and any custom partitioner, then submits the job to the Hadoop cluster for distributed processing of large datasets such as customer complaints. The manifest may also carry a Class-Path attribute listing dependency JARs so that the JVM can locate required libraries at run time. The other options describe things that do not belong in a manifest: the HDFS directory structure, Hive table configuration, and mapper and reducer source code are not stored there.
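To make the Main-Class and Class-Path attributes concrete, here is a minimal sketch that builds a manifest with the standard java.util.jar API, prints it, and reads the entry point back. It uses only the JDK, not Hadoop itself. The class name ManifestSketch, the driver name com.example.ComplaintsDriver, and the dependency path lib/commons-lang3.jar are hypothetical placeholders chosen for illustration, not names from any real project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestSketch {

    // Build an in-memory manifest naming the driver class (and, optionally, dependencies).
    static Manifest buildManifest(String mainClass, String classPath) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be present; without it the main attributes are skipped on write.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        if (classPath != null) {
            // Space-separated list of dependency JARs, relative to the JAR's location.
            attrs.put(Attributes.Name.CLASS_PATH, classPath);
        }
        return manifest;
    }

    // Parse manifest bytes and return the Main-Class value, or null if it is absent.
    static String readMainClass(byte[] manifestBytes) throws IOException {
        Manifest manifest = new Manifest(new ByteArrayInputStream(manifestBytes));
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical driver class and dependency path, for illustration only.
        Manifest manifest = buildManifest("com.example.ComplaintsDriver", "lib/commons-lang3.jar");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);

        // Print the manifest text as it would appear in META-INF/MANIFEST.MF.
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
        System.out.println("Main-Class read back: " + readMainClass(out.toByteArray()));
    }
}
```

In practice the manifest is usually generated by the build tool or by the jar command rather than written by hand. When a JAR has no Main-Class entry, the driver class has to be named explicitly after the JAR on the "hadoop jar" command line.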