@hadoopversity
Created February 13, 2018 02:54
This file contains the step-by-step procedure to install HDFS, YARN, Pig, Sqoop, Hive, Oozie, and Hue on a single-node CentOS system.
Cloudera Manual Installation - CentOS 6
CentOS 6, 64-bit
Hadoop Components
1. HDFS
2. YARN
3. Pig
4. Sqoop
5. Hive
6. Oozie
7. Hue
Prerequisite:
Switch to a user with sudo privileges
Steps to Install java
1. Get Oracle JDK 8
Visit the Oracle JDK download page and look for the RPM version:
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Copy the download link for jdk-8u102-linux-x64.rpm and fetch it with wget:
wget --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u102-b14/jdk-8u102-linux-x64.rpm
2. Install Oracle JDK 8
sudo yum localinstall jdk-8u102-linux-x64.rpm
3. Set JAVA_HOME Environment Variables
export JAVA_HOME=/usr/java/jdk1.8.0_102
export JRE_HOME=$JAVA_HOME/jre
export JAVA_PATH=$JAVA_HOME
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Create a java.sh file under /etc/profile.d/ and add the above export commands
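The export commands above can be collected into a single profile script. A minimal sketch (the path assumes the 8u102 RPM layout; here the file is written to a temp location for illustration — on a real host it would be /etc/profile.d/java.sh):

```shell
# Hypothetical java.sh, written to a temp path so this sketch is safe to run.
JAVA_SH="${TMPDIR:-/tmp}/java.sh"
cat > "$JAVA_SH" <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_102
export JRE_HOME=$JAVA_HOME/jre
export JAVA_PATH=$JAVA_HOME
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
EOF

# Source the script and confirm the variables take effect.
. "$JAVA_SH"
echo "$JAVA_HOME"
```

Placing the script in /etc/profile.d/ means every new login shell picks up the variables automatically.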
4. Verification
cd /usr/java
ls -lsah
java -version
source /etc/profile.d/java.sh
$ echo $JRE_HOME
/usr/java/jdk1.8.0_102/jre
$ echo $JAVA_HOME
/usr/java/jdk1.8.0_102/
Steps to Install CDH
Link: https://www.cloudera.com/documentation/enterprise/5-12-x/topics/cdh_ig_cdh5_install.html#topic_4_4_1__p_32
1. Add the CDH repository
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/
gpgkey=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
Save the file as cloudera.repo in /etc/yum.repos.d/
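The repo file can also be created non-interactively with a heredoc. A sketch (written to a temp path here so it runs without root; the real target is /etc/yum.repos.d/cloudera.repo):

```shell
# Stand-in for /etc/yum.repos.d/cloudera.repo
REPO_FILE="${TMPDIR:-/tmp}/cloudera.repo"
cat > "$REPO_FILE" <<'EOF'
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/
gpgkey=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF

# Sanity-check the file before handing it to yum.
grep '^baseurl=' "$REPO_FILE"
```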
2. Optionally Add a Repository Key
sudo rpm --import https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
3. Install CDH 5 with YARN
Master Daemons:
1. Resource Manager
sudo yum clean all;
sudo yum install hadoop-yarn-resourcemanager
2. Namenode
sudo yum clean all;
sudo yum install hadoop-hdfs-namenode
3. Secondary Namenode
sudo yum clean all;
sudo yum install hadoop-hdfs-secondarynamenode
Slave Daemons:
4. Nodemanager, Datanode and Mapreduce
sudo yum clean all;
sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
5. History server and proxy server
sudo yum clean all;
sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
6. Hadoop Client
sudo yum clean all; sudo yum install hadoop-client
7. Customizing Configuration Files
core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
8. Configuring Local Storage Directories
Sample configuration:
hdfs-site.xml on the NameNode:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/1/dfs/nn,file:///nfsmount/dfs/nn</value>
</property>
hdfs-site.xml on each DataNode:
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/1/dfs/dn,file:///data/2/dfs/dn,file:///data/3/dfs/dn,file:///data/4/dfs/dn</value>
</property>
On the NameNode host, create the dfs.namenode.name.dir local directories:
$ sudo mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
On each DataNode host, create the dfs.datanode.data.dir local directories:
$ sudo mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
sudo chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
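The directory layout above can be created in a single loop. A sketch that uses a temp directory as a stand-in root so it is safe to run anywhere (on a real host the root would be / and both mkdir and chown need sudo):

```shell
# DATA_ROOT stands in for / so the sketch does not touch real system paths.
DATA_ROOT="$(mktemp -d)"
for d in data/1/dfs/nn nfsmount/dfs/nn \
         data/1/dfs/dn data/2/dfs/dn data/3/dfs/dn data/4/dfs/dn; do
  mkdir -p "$DATA_ROOT/$d"
done

# On the real host you would then hand the directories to the hdfs user:
#   sudo chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/*/dfs/dn
ls "$DATA_ROOT/data"
```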
9. Formatting the NameNode
sudo -u hdfs hdfs namenode -format
10. Start HDFS
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
11. Create the /tmp Directory
$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
12. Yarn configuration
Edit /etc/hadoop/conf/yarn-site.xml and add the following properties:
<property>
<name>yarn.resourcemanager.address</name>
<value>127.0.0.1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>127.0.0.1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>127.0.0.1:8031</value>
</property>
Start the YARN ResourceManager and NodeManager
sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start
4. Create users
sudo -u hdfs hdfs dfs -mkdir /user
sudo -u hdfs hdfs dfs -mkdir /user/hadoop
sudo -u hdfs hdfs dfs -chown hdfs /user
sudo -u hdfs hdfs dfs -chown hdfs /user/hadoop
5. Installing Pig
sudo yum install pig
6. Installing Hive
sudo yum install hive
sudo yum install hive-metastore
sudo yum install hive-server2
Configuring the Hive Metastore
sudo yum install mysql-server
sudo service mysqld start
sudo yum install mysql-connector-java
ln -s /usr/share/java/mysql-connector-java.jar /usr/lib/hive/lib/mysql-connector-java.jar
Setup MySQL
Configure MySQL to use a strong password and to start at boot. Note that in the following procedure, your current root password is blank. Press the Enter key when you're prompted for the root password.
$ sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
Create the Database and User
$ mysql -u root -p
Enter password:
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'mypassword';
...
mysql> REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'hive'@'localhost';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'localhost';
mysql> FLUSH PRIVILEGES;
mysql> quit;
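The interactive session above can also be captured as a SQL script and fed to mysql non-interactively. A sketch that only generates the script ('mypassword' is the same placeholder used above; run it against a live server with `mysql -u root -p < "$SQL_FILE"`):

```shell
# Generate the metastore bootstrap SQL; nothing here talks to MySQL.
SQL_FILE="${TMPDIR:-/tmp}/create_metastore.sql"
cat > "$SQL_FILE" <<'EOF'
CREATE DATABASE metastore;
USE metastore;
SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'mypassword';
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'hive'@'localhost';
GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;
EOF

# Review before applying: the hive user gets only the table-level privileges
# Hive needs, never ALL PRIVILEGES.
grep 'GRANT' "$SQL_FILE"
```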
Configure the Metastore Service to Communicate with the MySQL Database
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
<description>the URL of the MySQL database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mypassword</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoStartMechanism</name>
<value>SchemaTable</value>
</property>
<property>
<name>hive.support.concurrency</name>
<description>Enable Hive's Table Lock Manager Service</description>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<description>Zookeeper quorum used by Hive's Table Lock Manager</description>
<value>zk1.myco.com,zk2.myco.com,zk3.myco.com</value>
</property>
<property>
<name>hive.zookeeper.client.port</name>
<value>2222</value>
<description>
The port at which the clients will connect.
</description>
</property>
Start the Hive services
sudo service hive-metastore start
sudo service hive-server2 start
7. Installing Sqoop
sudo yum install sqoop
Installing the JDBC Drivers for Sqoop 1
sudo mkdir -p /var/lib/sqoop
sudo chown sqoop:sqoop /var/lib/sqoop
sudo chmod 755 /var/lib/sqoop
sudo cp /usr/share/java/mysql-connector-java.jar /var/lib/sqoop/
8. Creating users
sudo -u hdfs hdfs dfs -mkdir /root
sudo -u hdfs hdfs dfs -chown -R root:hadoopusers /root
sudo -u hdfs hadoop fs -mkdir -p /user
sudo -u hdfs hadoop fs -mkdir -p /user/root
sudo -u hdfs hadoop fs -chown root /user
sudo -u hdfs hadoop fs -chown root /user/root
9. Installing Oozie
sudo yum install oozie
sudo yum install oozie-client
Configuring Oozie to Use MySQL
$ mysql -u root -p
Enter password:
mysql> create database oozie default character set utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
Query OK, 0 rows affected (0.00 sec)
mysql> exit
Bye
Configure Oozie to use MySQL.
Edit properties in the oozie-site.xml file as follows:
...
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://localhost:3306/oozie</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>oozie</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>oozie</value>
</property>
ln -s /usr/share/java/mysql-connector-java.jar /var/lib/oozie/mysql-connector-java.jar
Create oozie user
sudo groupadd cdh-hadoop
sudo useradd -g cdh-hadoop oozie
sudo -u hdfs hdfs dfs -mkdir /user/oozie
sudo -u hdfs hadoop fs -chown oozie /user/oozie
sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -run
wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
sudo unzip ext-2.2.zip -d /var/lib/oozie/
Install the Oozie shared library into the oozie user's home directory in HDFS
sudo -u hdfs hadoop fs -chown oozie:oozie /user/oozie
sudo oozie-setup sharelib create -fs hdfs://localhost:8020 -locallib /usr/lib/oozie/oozie-sharelib-yarn
10. Installing Hue
sudo yum install hue
sudo yum install hue-plugins
Configure Hue
HDFS
Add the following to hdfs-site.xml:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
core-site.xml
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
Restart Hadoop
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x restart ; done