Thursday, June 10, 2010

Using Hadoop on a single node

Download a Hadoop release from Apache's website (http://hadoop.apache.org/common/releases.html) and extract the archive to /usr/local.
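The extraction step might look like the sketch below. The helper name extract_hadoop and the default target are assumptions made for illustration; the tarball name depends on the release you downloaded.

```shell
# Sketch: unpack a downloaded Hadoop release tarball into a target directory.
# Assumption: extract_hadoop is an illustrative helper, not part of Hadoop.
extract_hadoop() {
  tarball="$1"             # path to the downloaded .tar.gz release
  dest="${2:-/usr/local}"  # target directory, /usr/local as in this post

  # -x extract, -z gunzip, -f read from the given file, -C change to dest first
  tar -xzf "$tarball" -C "$dest"
}
```

Writing to /usr/local normally requires root, so call the function from a root shell, or run the equivalent tar command under sudo.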

Since the extracted folder already contains a bin folder with the hadoop executable, little configuration is needed. The one file that does need to change is conf/hadoop-env.sh.

Open conf/hadoop-env.sh in Hadoop's directory (/usr/local/hadoop-*) in an editor (use sudo, since the file is under /usr/local), uncomment the line that starts with export JAVA_HOME=, and set it to the directory of your Java installation, replacing any value already on that line (in our case, it was /usr/lib/jvm/java-6-sun).
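If you would rather script this edit than do it by hand, something like the following should work. It is only a sketch: set_java_home is a made-up helper name, and it assumes the file already contains an export JAVA_HOME= line (commented out or not). If there is no such line, the file is left unchanged.

```shell
# Sketch: point JAVA_HOME in hadoop-env.sh at a given Java directory.
# Assumption: set_java_home is an illustrative helper, not part of Hadoop.
set_java_home() {
  env_file="$1"   # e.g. /usr/local/hadoop-*/conf/hadoop-env.sh
  java_dir="$2"   # e.g. /usr/lib/jvm/java-6-sun

  # Rewrite the JAVA_HOME line whether or not it is commented out.
  # A temporary file is used so that no particular "sed -i" dialect is
  # needed, and the result is copied back to keep the original file's
  # ownership and permissions.
  sed "s|^[#[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$java_dir|" \
    "$env_file" > "$env_file.tmp" &&
    cat "$env_file.tmp" > "$env_file" &&
    rm -f "$env_file.tmp"
}
```

As with editing by hand, this needs write access to the conf folder, so run it from a root shell. Afterwards, grep JAVA_HOME conf/hadoop-env.sh should show the uncommented line.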

Finally, make an input folder in the Hadoop directory, put the input files in that folder, and run

sudo bin/hadoop jar hadoop-*-examples.jar grep input output 'WORDTOBEFOUND'

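This command runs the grep (word search) example that comes with Hadoop. It searches the files in the input folder for 'WORDTOBEFOUND', which is treated as a regular expression, and writes the results to a new output folder. The output folder must not already exist, or the job will fail.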
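Putting the last step together, here is a rough sketch of staging the input, running the example, and printing the result. The function name run_grep_example and its arguments are assumptions for illustration. It must be run as a user who can write to the Hadoop directory (this post used sudo), and the input files should be given as absolute paths because the function changes directory first.

```shell
# Sketch: stage input files, run the bundled grep example, print the result.
# Assumption: run_grep_example is an illustrative helper, not part of Hadoop.
# The body runs in a subshell, so the caller's working directory is unchanged.
run_grep_example() (
  hadoop_dir="$1"   # Hadoop directory, e.g. the one under /usr/local
  pattern="$2"      # regular expression to search for
  shift 2           # remaining arguments are the input files

  cd "$hadoop_dir" || exit 1

  # Hadoop refuses to write into an existing output directory.
  if [ -e output ]; then
    echo "output already exists; remove or rename it first" >&2
    exit 1
  fi

  mkdir -p input
  cp "$@" input/ || exit 1

  # Same command as in the post, with the pattern passed as an argument.
  bin/hadoop jar hadoop-*-examples.jar grep input output "$pattern" || exit 1

  # The results are written as part files inside output/.
  cat output/*
)
```

Before running it a second time, remove or rename the output folder.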
