Try It Yourself: Installing Hadoop Using the Apache Release

In this exercise we will install a pseudo-distributed mode Hadoop cluster using the latest Hadoop release downloaded from an Apache mirror.

As this is a test cluster, the following specification will be used in our example: Red Hat Enterprise Linux 7.2. (The installation steps would be similar using other Linux distributions such as Ubuntu.)

1. Disable SELinux (this is known to cause issues with Hadoop):

    ```shell
    sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
    ```

2. Disable IPv6 (this is also known to cause issues with Hadoop):

    ```shell
    sudo sed -i "\$anet.ipv6.conf.all.disable_ipv6 = 1" /etc/sysctl.conf
    ```

3. Run the sestatus command to ensure SELinux is not enabled (note that the change in Step 1 takes effect after a reboot):

    ```shell
    sestatus
    ```

4. Install Java. We will install the OpenJDK, which will install both a JDK and a JRE:

    ```shell
    sudo yum install java-1.7.0-openjdk-devel
    ```

5. Test that Java has been successfully installed by running the following command:

    ```shell
    java -version
    ```

    If Java has been installed correctly you should see output similar to the following:

    ```
    java version "1.7.0_101"
    OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
    ```

    Note that depending upon which operating system you are deploying on, you may have a version of Java and a JDK installed already. In these cases it may not be necessary to install the JDK, or you may need to set up alternatives so you do not have conflicting Java versions.

6. Locate the installation path for Java, and set the JAVA_HOME environment variable:

    ```shell
    export JAVA_HOME=/usr/lib/jvm/REPLACE_WITH_YOUR_PATH/
    ```

7. Download Hadoop from your nearest Apache download mirror. We will use Hadoop version 2.7.2 for our example. You can obtain the link by selecting the binary option for the version of your choice on the Apache Hadoop downloads page:

    ```shell
    wget REPLACE_WITH_YOUR_MIRROR/hadoop-2.7.2.tar.gz
    ```

8. Unpack the Hadoop release, move it into a system directory, and set an environment variable for the Hadoop home directory:

    ```shell
    tar -xvf hadoop-2.7.2.tar.gz
    sudo mv hadoop-2.7.2 /usr/share/hadoop
    export HADOOP_HOME=/usr/share/hadoop
    ```

9. Create a mapred-site.xml file (I will discuss this later) in the Hadoop configuration directory, using the template supplied with the release:

    ```shell
    sudo cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml
    ```

10. Create a directory which we will use as an alternative to the Hadoop configuration directory:

    ```shell
    sudo mkdir -p /etc/hadoop/conf
    ```

11. Create symbolic links between the Hadoop configuration files and the /etc/hadoop/conf directory created in Step 10:

    ```shell
    sudo ln -s $HADOOP_HOME/etc/hadoop/* /etc/hadoop/conf/
    ```

12. Add the JAVA_HOME environment variable to hadoop-env.sh (the file used to source environment variables for Hadoop processes). Substitute the correct path to your Java home directory as defined in Step 6:

    ```shell
    sudo sed -i "\$aexport JAVA_HOME=/REPLACE_WITH_YOUR_JDK_PATH/" $HADOOP_HOME/etc/hadoop/hadoop-env.sh
    ```

13. Create a logs directory for Hadoop:

    ```shell
    sudo mkdir $HADOOP_HOME/logs
    ```

14. Create users and groups for HDFS and YARN:

    ```shell
    sudo groupadd hadoop
    sudo useradd -g hadoop hdfs
    sudo useradd -g hadoop yarn
    ```

15. Change the group ownership for the Hadoop release files:

    ```shell
    sudo chgrp -R hadoop /usr/share/hadoop
    ```

16. Run the built-in Pi Estimator example included with the Hadoop release, from the Hadoop home directory:

    ```shell
    sudo -u hdfs bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 16 1000
    ```

    As we have not started any daemons or initialized HDFS, this program runs in LocalJobRunner mode (recall that I discussed this in Hour 2, "Understanding the Hadoop Cluster Architecture"). If this runs correctly, the job should run to completion and print an estimated value of Pi.

    Now let's configure a pseudo-distributed mode Hadoop cluster from your installation.

17. Use the vi editor to update the core-site.xml file, which contains important information about the cluster, specifically the location of the namenode:

    ```shell
    sudo vi /etc/hadoop/conf/core-site.xml
    ```

    Add the following config between the `<configuration>` and `</configuration>` tags:

    ```xml
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://hadoopnode0:8020</value>
    </property>
    ```

    Note that the value for the fs.defaultFS configuration parameter needs to be set to hdfs://HOSTNAME:8020, where HOSTNAME is the name of the NameNode host, which happens to be the localhost in this case.

18. Adapt the instructions in Step 17 to similarly update the hdfs-site.xml file, which contains information specific to HDFS, including the replication factor, which is set to 1 in this case as this is a pseudo-distributed mode cluster:

    ```shell
    sudo vi /etc/hadoop/conf/hdfs-site.xml
    ```

    Add the following config between the `<configuration>` and `</configuration>` tags:

    ```xml
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    ```

19. Adapt the instructions in Step 17 to similarly update the yarn-site.xml file, which contains information specific to YARN.
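Rather than hand-editing each site file in vi, the core-site.xml, hdfs-site.xml, and yarn-site.xml edits above can be generated with shell heredocs. The sketch below writes to a scratch directory (point CONF_DIR at /etc/hadoop/conf on a real node); note that the yarn.nodemanager.aux-services property is an assumption on my part, the commonly used minimal pseudo-distributed setting, since the exercise does not show the yarn-site.xml contents.

```shell
# Sketch: generate the three site files with heredocs instead of vi.
# Writes to a scratch directory by default; set CONF_DIR=/etc/hadoop/conf
# (and run with sudo) on a real node.
CONF_DIR=${CONF_DIR:-./conf-sketch}
mkdir -p "$CONF_DIR"

# NameNode location, as in the exercise (hadoopnode0 is the local host here)
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoopnode0:8020</value>
  </property>
</configuration>
EOF

# Replication factor 1 for a pseudo-distributed (single-node) cluster
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

# Assumed minimal YARN setting; not shown in the exercise text
cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

ls "$CONF_DIR"
```

The quoted heredoc delimiters (`<<'EOF'`) keep the XML literal, so no shell expansion occurs inside the config bodies.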
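To see what the Pi Estimator job above is actually computing, here is the same dart-board method in plain awk: sample random points in the unit square and count the fraction that land inside the quarter circle. This is only an illustration of the arithmetic, not Hadoop's implementation, which distributes the sampling across map tasks (16 of them in the command above) and uses a quasi-random point sequence.

```shell
# Monte Carlo estimate of Pi: 4 * (points inside quarter circle / total).
# This is the idea the Hadoop Pi Estimator distributes across map tasks.
pi_est=$(awk 'BEGIN {
  srand(1)                       # fixed seed so the run is repeatable
  n = 200000; inside = 0
  for (i = 0; i < n; i++) {
    x = rand(); y = rand()
    if (x * x + y * y <= 1) inside++
  }
  printf "%.4f", 4 * inside / n
}')
echo "Estimated value of Pi is $pi_est"
```

With 200,000 samples the estimate typically lands within about 0.01 of Pi, which is why the distributed version increases the sample count rather than the precision of any single draw.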
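The `sed -i "\$a..."` invocation used above to append JAVA_HOME to hadoop-env.sh is sed's append command: `$` addresses the last line of the file and `a` appends text after it. Here is a minimal demonstration on a scratch file rather than the real hadoop-env.sh; the JDK path is a placeholder, as in the exercise.

```shell
# Demonstrate the "$a" append used for hadoop-env.sh, on a scratch file.
# Note: the one-line "a text" form and suffix-less -i are GNU sed features
# (standard on RHEL); BSD sed needs different syntax.
env_file=./hadoop-env-sketch.sh
printf '# Hadoop environment settings\n' > "$env_file"

# \$ escapes to a literal $ inside double quotes, addressing the last line
sed -i "\$aexport JAVA_HOME=/usr/lib/jvm/REPLACE_WITH_YOUR_JDK_PATH/" "$env_file"

tail -n 1 "$env_file"
```

Appending rather than editing in place is why the step works on a stock hadoop-env.sh: the new export lands after any existing settings and therefore wins when the file is sourced.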