
Practical 5

This document provides a step-by-step guide to install Hadoop 2.4.1 in pseudo distributed mode. It includes instructions for setting up Hadoop environment variables, configuring necessary files such as core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml, and applying changes to the system. The document emphasizes user-defined property values for customization according to the user's Hadoop infrastructure.

Uploaded by

tosam67394

Installing Hadoop in Pseudo Distributed Mode

Follow the steps given below to install Hadoop 2.4.1 in pseudo distributed
mode.

Step 1 − Setting Up Hadoop


You can set the Hadoop environment variables by appending the following
commands to the ~/.bashrc file.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME

Now apply the changes to the current shell session.

$ source ~/.bashrc
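
To confirm that the shell actually sees the variables from Step 1, a check along the following lines can help. This is a minimal sketch and not part of the original guide: the function name check_hadoop_env is hypothetical, and it only tests that each variable is non-empty, not that the paths exist.

```shell
# Hypothetical helper (not part of Hadoop): prints every Step 1 variable
# that is empty or unset and returns non-zero if any is missing.
check_hadoop_env() {
  missing=0
  for var in HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
             HADOOP_HDFS_HOME YARN_HOME HADOOP_COMMON_LIB_NATIVE_DIR; do
    # Indirect lookup of the variable whose name is stored in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "not set: $var"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_hadoop_env after sourcing ~/.bashrc prints nothing and returns zero when all six variables are set.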

Step 2 − Hadoop Configuration


You can find all the Hadoop configuration files in
$HADOOP_HOME/etc/hadoop. You need to change these configuration files to
match your Hadoop infrastructure.

$ cd $HADOOP_HOME/etc/hadoop

To develop Hadoop programs in Java, you have to set the Java environment
variable in the hadoop-env.sh file by replacing the JAVA_HOME value with
the location of Java on your system. The path below is an example.
export JAVA_HOME=/usr/local/jdk1.7.0_71
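
Before saving hadoop-env.sh, it may be worth checking that the chosen directory really contains a Java binary. The following is a minimal sketch under that assumption; check_java_home is a hypothetical helper name, not a Hadoop command.

```shell
# Hypothetical helper: succeeds only if the given directory contains an
# executable bin/java, which is what a JAVA_HOME value should point at.
check_java_home() {
  if [ -x "$1/bin/java" ]; then
    echo "ok: $1/bin/java"
  else
    echo "no executable java under: $1" >&2
    return 1
  fi
}
```

For the example path above, the call would be check_java_home /usr/local/jdk1.7.0_71.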

The following is the list of files that you have to edit to configure
Hadoop.

core-site.xml
The core-site.xml file contains information such as the port number used
for the Hadoop instance, memory allocated for the file system, memory limit
for storing the data, and the size of the read/write buffers.
Open core-site.xml and add the following properties between the
<configuration> and </configuration> tags. (In Hadoop 2.x, fs.default.name
is a deprecated alias of fs.defaultFS; both names are accepted.)

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
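
After editing, the value stored in a site file can be read back without starting Hadoop. The sketch below is one way to do that with awk; get_property is a hypothetical helper, and it assumes each <name> and <value> element sits on a single line as in the listing above. It is not a general XML parser.

```shell
# Hypothetical helper: print the value of one property from a Hadoop site file.
#   $1 = path to the site file, $2 = property name
# Assumes <name>...</name> and <value>...</value> each occupy one line.
get_property() {
  awk -v key="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>[ \t]*/, "", name)
      sub(/[ \t]*<\/name>.*/, "", name)
    }
    /<value>/ && name == key {
      value = $0
      sub(/.*<value>[ \t]*/, "", value)
      sub(/[ \t]*<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}
```

A typical call would be get_property "$HADOOP_HOME/etc/hadoop/core-site.xml" fs.default.name. Once the Hadoop binaries are on the PATH, hdfs getconf -confKey fs.default.name should report the same value.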

hdfs-site.xml
The hdfs-site.xml file contains information such as the replication
factor, the namenode path, and the datanode paths on your local file
system, that is, the places where you want to store the Hadoop
infrastructure.

Let us assume the following data.

dfs.replication (data replication value) = 1

(In the paths given below, hadoop is the user name, and
hadoopinfra/hdfs/namenode and hadoopinfra/hdfs/datanode are the
directories created by the HDFS file system.)

namenode path = /home/hadoop/hadoopinfra/hdfs/namenode
datanode path = /home/hadoop/hadoopinfra/hdfs/datanode

Open this file and add the following properties between the
<configuration> and </configuration> tags.

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
</property>

<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
</property>
</configuration>
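Note − In the above file, all the property values are user-defined and you
can change them according to your Hadoop infrastructure. In Hadoop 2.x,
dfs.name.dir and dfs.data.dir are deprecated aliases of
dfs.namenode.name.dir and dfs.datanode.data.dir; both forms are accepted.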
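
The guide treats the namenode and datanode directories as created by HDFS. Creating them up front as the hadoop user is a common precaution against permission problems, and a minimal sketch of that step follows. The helper name make_hdfs_dirs is hypothetical, and the base directory is passed in rather than hard-coded.

```shell
# Hypothetical helper: create the namenode and datanode directories under a
# base directory (for example /home/hadoop/hadoopinfra) if they are missing.
make_hdfs_dirs() {
  mkdir -p "$1/hdfs/namenode" "$1/hdfs/datanode"
}
```

With the paths assumed above, the call would be make_hdfs_dirs /home/hadoop/hadoopinfra.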

yarn-site.xml
This file is used to configure YARN for Hadoop. Open the yarn-site.xml file
and add the following properties between the <configuration> and
</configuration> tags.

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

mapred-site.xml
This file is used to specify which MapReduce framework we are using. By
default, Hadoop contains only a template of this file, named
mapred-site.xml.template. First, copy mapred-site.xml.template to
mapred-site.xml using the following command.
$ cp mapred-site.xml.template mapred-site.xml
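
If the steps are run more than once, a plain cp would overwrite an already edited mapred-site.xml. The sketch below guards against that; ensure_mapred_site is a hypothetical helper name, and the configuration directory is passed as an argument.

```shell
# Hypothetical helper: copy the template only when mapred-site.xml does not
# exist yet, so an already edited file is never overwritten.
#   $1 = configuration directory, e.g. $HADOOP_HOME/etc/hadoop
ensure_mapred_site() {
  conf_dir=$1
  if [ -f "$conf_dir/mapred-site.xml" ]; then
    echo "kept existing mapred-site.xml"
  else
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    echo "created mapred-site.xml from template"
  fi
}
```

A typical call would be ensure_mapred_site "$HADOOP_HOME/etc/hadoop".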
Open the mapred-site.xml file and add the following properties between
the <configuration> and </configuration> tags.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
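
With all four files edited, a quick presence check can catch a skipped step. The following is a minimal sketch; check_site_files is a hypothetical helper, and it only verifies that each file exists and is non-empty, not that its contents are valid.

```shell
# Hypothetical helper: report which of the four site files edited in Step 2
# are missing or empty in the given configuration directory.
#   $1 = configuration directory, e.g. $HADOOP_HOME/etc/hadoop
check_site_files() {
  missing=0
  for f in core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml; do
    if [ ! -s "$1/$f" ]; then
      echo "missing or empty: $f"
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be check_site_files "$HADOOP_HOME/etc/hadoop".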
