Java Hadoop MapReduce code for my Big Data Analytics Project
- Prerequisites: JDK and Hadoop
- Make sure HADOOP_CLASSPATH points to the tools.jar inside jdkx.y/lib/
- Compile the code:
hadoop com.sun.tools.javac.Main File.java
- Create JAR file:
jar cf File.jar *.class
- Create input directory in Hadoop:
hadoop dfs -mkdir /dir
- Upload input file into Hadoop:
hadoop dfs -put input.csv /dir/input.txt
- Run the code (File is the main class; /dir/out.txt is the output directory and should not exist yet):
hadoop jar File.jar File /dir/input.txt /dir/out.txt
- Check the output directory:
hadoop dfs -ls /dir/*
- Verify the output:
hadoop dfs -cat /dir/out.txt/part-r-00000
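
The steps above can be collected into one small shell function. This is only a sketch: the commands are the ones listed in this README, but the names CLASS, HDFS_DIR, INPUT, run, run_job and the DRY_RUN switch are illustrative and not part of the project. It assumes HADOOP_CLASSPATH is already set as described above.

```shell
#!/bin/sh
# Sketch only: variable and function names here are illustrative.
# The commands themselves are the ones listed in the steps above.

CLASS=File          # main class, compiled from File.java
HDFS_DIR=/dir       # working directory in HDFS
INPUT=input.csv     # local input file

# Print a command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Run every step in order, stopping at the first one that fails.
run_job() {
    run hadoop com.sun.tools.javac.Main "$CLASS.java" &&
    run jar cf "$CLASS.jar" *.class &&
    run hadoop dfs -mkdir "$HDFS_DIR" &&
    run hadoop dfs -put "$INPUT" "$HDFS_DIR/input.txt" &&
    run hadoop jar "$CLASS.jar" "$CLASS" "$HDFS_DIR/input.txt" "$HDFS_DIR/out.txt" &&
    run hadoop dfs -ls "$HDFS_DIR/*" &&
    run hadoop dfs -cat "$HDFS_DIR/out.txt/part-r-00000"
}
```

To try it, save the block under any file name, source it in your shell, and call run_job. Calling it as DRY_RUN=1 run_job only prints the seven commands without executing them, which is a quick way to check the paths before touching HDFS.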