
Installing Hadoop 2.6 on CentOS 7

  • Assume there are three machines:
  • 192.168.33.100 hadoop-master
  • 192.168.33.200 hadoop-slave01
  • 192.168.33.201 hadoop-slave02

Configure the system environment

Perform the following steps on all three machines.

  • Configure /etc/hosts (a connectivity check follows this step)
vim /etc/hosts
# append the following 3 lines
192.168.33.100 hadoop-master
192.168.33.200 hadoop-slave01
192.168.33.201 hadoop-slave02
# save and exit
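
A minimal connectivity check, assuming ICMP is not blocked between the machines: run it on each node; every hostname should resolve to the address configured above and reply.

# each command should print replies from the matching IP
ping -c 1 hadoop-master
ping -c 1 hadoop-slave01
ping -c 1 hadoop-slave02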
  • Add a hadoop user
useradd hadoop
# set the hadoop user's password (here: hadoop)
passwd hadoop
  • Set up passwordless SSH from hadoop-master to hadoop-slave01 and hadoop-slave02 (a verification sketch follows this step)
# as root; the following commands prompt for each machine's password
ssh-keygen -t rsa
ssh-copy-id root@hadoop-slave01
ssh-copy-id root@hadoop-slave02
# repeat as the hadoop user; the following commands prompt for each machine's password
su hadoop
ssh-keygen -t rsa
ssh-copy-id hadoop@hadoop-slave01
ssh-copy-id hadoop@hadoop-slave02
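
A quick sketch to confirm the keys were installed, run from hadoop-master as the hadoop user: each command should print the remote hostname without asking for a password.

# no password prompt should appear
ssh hadoop@hadoop-slave01 hostname
ssh hadoop@hadoop-slave02 hostname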
  • Download and install Hadoop 2.6 (a copy check follows these commands)
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar zxvf hadoop-2.6.0.tar.gz
mv hadoop-2.6.0 /usr/hadoop
# download on hadoop-master, then copy it to the other two machines
scp -r /usr/hadoop root@192.168.33.200:/usr
scp -r /usr/hadoop root@192.168.33.201:/usr
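
A simple check that the copy reached both slaves (run from hadoop-master as root; the listings should match the one on the master):

# compare the top-level contents on each slave
ssh root@192.168.33.200 ls /usr/hadoop
ssh root@192.168.33.201 ls /usr/hadoop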
  • Install the Java environment (a verification sketch follows this step)
# CentOS 7 has no package named "openjdk"; install the OpenJDK development package instead
yum install -y java-1.8.0-openjdk-devel
vim /etc/profile
# append the following 6 lines
export JAVA_HOME=/usr/lib/jvm/java-openjdk
export JRE_HOME=/usr/lib/jvm/java-openjdk/jre
export CLASSPATH=.:${CLASSPATH}:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${PATH}:${JAVA_HOME}/bin:${JRE_HOME}/bin
export HADOOP_HOME=/usr/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
# save and exit, then reload the profile so the settings take effect
source /etc/profile
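
After reloading the profile, the commands below should print the Java and Hadoop versions; this assumes the /usr/lib/jvm/java-openjdk symlink exists on your system (it is created by the OpenJDK packages). If a command is not found, re-check the export lines above.

# verify the environment variables took effect
java -version
echo $JAVA_HOME
hadoop version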

Configure Hadoop

Perform the following steps on hadoop-master.

  • Create the tmp directory
mkdir /usr/hadoop/tmp
  • Configure hadoop-env.sh
vim /usr/hadoop/etc/hadoop/hadoop-env.sh
# find the line export JAVA_HOME=${JAVA_HOME} and change it to
export JAVA_HOME=/usr/lib/jvm/java-openjdk
  • Configure core-site.xml

Configure the master's (NameNode) address and port as shown below; be sure the /usr/hadoop/tmp directory exists before applying this configuration. fs.defaultFS specifies the default file system, i.e. the NameNode address and port (fs.default.name is the deprecated Hadoop 1.x name for the same setting).

vim /usr/hadoop/etc/hadoop/core-site.xml
# paste the following content, then save
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>

    <!-- file system properties -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-master:9000</value>
    </property>

</configuration>

If hadoop.tmp.dir is not set, Hadoop falls back to a directory under /tmp (by default /tmp/hadoop-${user.name}), which is wiped on every reboot, so HDFS would have to be re-formatted after each restart or it will fail to start. A quick way to confirm the effective settings is sketched below.
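
Run these on hadoop-master; hdfs getconf reads the configuration files directly, so the cluster does not need to be running yet.

# print the effective NameNode address and temp directory
/usr/hadoop/bin/hdfs getconf -confKey fs.defaultFS
/usr/hadoop/bin/hdfs getconf -confKey hadoop.tmp.dir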

  • Configure hdfs-site.xml

The dfs.replication parameter below sets how many replicas of each block HDFS keeps. A single copy is enough for casual use, but with no redundancy a single failed DataNode loses data, so important data should keep at least one extra replica. It is set to 2 here because this cluster has exactly two DataNodes, one replica on each; if fewer DataNodes are available than the configured replication factor, HDFS reports under-replicated blocks. The default is 3 when the parameter is not set. A sketch for checking a file's replication is included after the snippet.

vim /usr/hadoop/etc/hadoop/hdfs-site.xml
# paste the following content, then save
<configuration>
  <property>
      <name>dfs.replication</name>
      <value>2</value>
  </property>
</configuration>
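
Once the cluster is up (see the start-up section below), you can verify that files really get two replicas. A minimal sketch, assuming a hypothetical local file /tmp/test.txt:

# upload a file and inspect its replication (run as the hadoop user)
hadoop fs -put /tmp/test.txt /test.txt
hadoop fs -ls /                      # the second column shows the replication factor for files
hdfs fsck /test.txt -files -blocks   # the block report should show two replicas per block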
  • Configure the master and slave host lists
vim /usr/hadoop/etc/hadoop/masters
# add the following line, then save
hadoop-master

vim /usr/hadoop/etc/hadoop/slaves
# add the following 2 lines, then save
hadoop-slave01
hadoop-slave02
  • Copy the configuration above to the other two machines (an equivalent loop is sketched after these commands)
scp -r /usr/hadoop/tmp root@hadoop-slave01:/usr/hadoop
scp -r /usr/hadoop/etc/hadoop/hadoop-env.sh root@hadoop-slave01:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/core-site.xml root@hadoop-slave01:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/hdfs-site.xml root@hadoop-slave01:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/masters root@hadoop-slave01:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/slaves root@hadoop-slave01:/usr/hadoop/etc/hadoop

scp -r /usr/hadoop/tmp root@hadoop-slave02:/usr/hadoop
scp -r /usr/hadoop/etc/hadoop/hadoop-env.sh root@hadoop-slave02:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/core-site.xml root@hadoop-slave02:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/hdfs-site.xml root@hadoop-slave02:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/masters root@hadoop-slave02:/usr/hadoop/etc/hadoop
scp -r /usr/hadoop/etc/hadoop/slaves root@hadoop-slave02:/usr/hadoop/etc/hadoop
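
The same copies can be written as a short loop so no file is missed; this is only a convenience sketch equivalent to the commands above.

# copy the edited files to both slaves in one pass
for host in hadoop-slave01 hadoop-slave02; do
  scp -r /usr/hadoop/tmp root@$host:/usr/hadoop
  for f in hadoop-env.sh core-site.xml hdfs-site.xml masters slaves; do
    scp /usr/hadoop/etc/hadoop/$f root@$host:/usr/hadoop/etc/hadoop
  done
done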
  • Fix ownership and permissions (required on all three machines)
chown -R hadoop:hadoop /usr/hadoop
chmod -R 755 /usr/hadoop

Start Hadoop

Run the following as the hadoop user on hadoop-master, from the /usr/hadoop directory.

  • Format the HDFS file system
./bin/hdfs namenode -format
  • Start the HDFS and YARN daemons
./sbin/start-dfs.sh
./sbin/start-yarn.sh
  • Verify the daemons started

The jps utility that ships with the JDK should show the ResourceManager, NameNode, and SecondaryNameNode running on the master, and DataNode and NodeManager on each slave.

# On Master
[hadoop@Master hadoop]$ jps
29011 ResourceManager
21836 NameNode
2159 Jps
32261 SecondaryNameNode

# On Slave
[root@Slave_1 hadoop]# jps
4799 DataNode
20182 Jps
11010 NodeManager
  • Check cluster status with ./bin/hdfs dfsadmin -report
# On Master
[hadoop@Master hadoop]$ ./bin/hdfs dfsadmin -report
Configured Capacity: 107321753600 (99.95 GB)
Present Capacity: 95001157632 (88.48 GB)
DFS Remaining: 95001149440 (88.48 GB)
DFS Used: 8192 (8 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
  • Check which ports Hadoop is listening on
netstat -tnulp | grep java
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:9000          0.0.0.0:*               LISTEN      21836/java
tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      32261/java
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      21836/java
tcp6       0      0 :::8030                 :::*                    LISTEN      29011/java
tcp6       0      0 :::8031                 :::*                    LISTEN      29011/java
tcp6       0      0 :::8032                 :::*                    LISTEN      29011/java
tcp6       0      0 :::8033                 :::*                    LISTEN      29011/java
tcp6       0      0 :::8088                 :::*                    LISTEN      29011/java
  • Open the web UI in a browser (see the note below)
http://hadoop-master:50070
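
The page above is the HDFS NameNode UI on port 50070; the YARN ResourceManager UI listens on port 8088 (visible in the netstat output above), so http://hadoop-master:8088 should work as well. To shut the cluster down later, run the matching stop scripts as the hadoop user on hadoop-master:

./sbin/stop-yarn.sh
./sbin/stop-dfs.sh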