Hadoop2namenodeHA的示例分析

这篇文章主要介绍Hadoop2 namenode HA的示例分析,文中介绍的非常详细,具有一定的参考价值,感兴趣的小伙伴们一定要看完!

创新互联是一家专注网站建设、网络营销策划、微信小程序开发、电子商务建设、网络推广、移动互联开发、研究、服务为一体的技术型公司。公司成立十多年以来,已经为数千家垃圾桶各业的企业公司提供互联网服务。现在,服务的数千家客户与我们一路同行,见证我们的成长;未来,我们一起分享成功的喜悦。

实验的Hadoop版本为2.5.2,硬件环境是5台虚拟机,使用的均是CentOS6.6操作系统,虚拟机IP和hostname分别为:
192.168.63.171    node1.zhch
192.168.63.172    node2.zhch
192.168.63.173    node3.zhch
192.168.63.174    node4.zhch
192.168.63.175    node5.zhch

ssh免密码、防火墙、JDK这里就不在赘述了。虚拟机的角色分配是 node1为 主namenode 节点,node2为 备namendoe节点,node3、4、5为 datanode节点;node1、2、3上还将部署 zookeeper 和 journalnode。

一、搭建Zookeeper集群
 Storm0.9.4安装 中搭建Zookeeper集群的部分

[yyl@node1 ~]$ zkServer.sh start
JMX enabled by default
Using config: /home/yyl/program/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[yyl@node1 ~]$ zkServer.sh status
JMX enabled by default
Using config: /home/yyl/program/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[yyl@node2 ~]$ zkServer.sh start
JMX enabled by default
Using config: /home/yyl/program/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[yyl@node2 ~]$ zkServer.sh status
JMX enabled by default
Using config: /home/yyl/program/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
[yyl@node3 ~]$ zkServer.sh start
JMX enabled by default
Using config: /home/yyl/program/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[yyl@node3 ~]$ zkServer.sh status
JMX enabled by default
Using config: /home/yyl/program/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

二、配置Hadoop环境
 

## 解压
[yyl@node1 program]$ tar -zxf hadoop-2.5.2.tar.gz 
## 创建文件夹
[yyl@node1 program]$ mkdir hadoop-2.5.2/name
[yyl@node1 program]$ mkdir hadoop-2.5.2/data
[yyl@node1 program]$ mkdir hadoop-2.5.2/journal
[yyl@node1 program]$ mkdir hadoop-2.5.2/tmp

## 配置hadoop-env.sh
[yyl@node1 program]$ cd hadoop-2.5.2/etc/hadoop/
[yyl@node1 hadoop]$ vim hadoop-env.sh
export JAVA_HOME=/usr/lib/java/jdk1.7.0_80

## 配置yarn-env.sh
[yyl@node1 hadoop]$ vim yarn-env.sh
export JAVA_HOME=/usr/lib/java/jdk1.7.0_80

## 配置slaves
[yyl@node1 hadoop]$ vim slaves
node3.zhch
node4.zhch
node5.zhch

## 配置core-site.xml
[yyl@node1 hadoop]$ vim core-site.xml


  fs.defaultFS
  hdfs://mycluster


  io.file.buffer.size
  131072 


  hadoop.tmp.dir
  file:/home/yyl/program/hadoop-2.5.2/tmp


  hadoop.proxyuser.hadoop.hosts
  *


  hadoop.proxyuser.hadoop.groups
  *


  ha.zookeeper.quorum
  node1.zhch:2181,node2.zhch:2181,node3.zhch:2181


  ha.zookeeper.session-timeout.ms
  1000



## 配置hdfs-site.xml
[yyl@node1 hadoop]$ vim hdfs-site.xml


  dfs.namenode.name.dir
  file:/home/yyl/program/hadoop-2.5.2/name


  dfs.datanode.data.dir
  file:/home/yyl/program/hadoop-2.5.2/data


  dfs.replication
  1


  dfs.webhdfs.enabled
  true


  dfs.permissions
  false


  dfs.permissions.enabled
  false


  dfs.nameservices
  mycluster


  dfs.ha.namenodes.mycluster
  nn1,nn2


  dfs.namenode.rpc-address.mycluster.nn1
  node1.zhch:9000


  dfs.namenode.rpc-address.mycluster.nn2
  node2.zhch:9000


  dfs.namenode.servicerpc-address.mycluster.nn1
  node1.zhch:53310


  dfs.namenode.servicerpc-address.mycluster.nn2
  node2.zhch:53310


  dfs.namenode.http-address.mycluster.nn1
  node1.zhch:50070


  dfs.namenode.http-address.mycluster.nn2
  node2.zhch:50070


  dfs.namenode.shared.edits.dir
  qjournal://node1.zhch:8485;node2.zhch:8485;node3.zhch:8485/mycluster


  dfs.client.failover.proxy.provider.mycluster
  org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider


  dfs.ha.fencing.methods
  sshfence


  dfs.ha.fencing.ssh.private-key-files
  /home/yyl/.ssh/id_rsa


  dfs.ha.fencing.ssh.connect-timeout
  30000


  dfs.journalnode.edits.dir
  /home/yyl/program/hadoop-2.5.2/journal


  dfs.ha.automatic-failover.enabled
  true


  ha.failover-controller.cli-check.rpc-timeout.ms
  60000


  ipc.client.connect.timeout
  60000


  dfs.image.transfer.bandwidthPerSec
  4194304



## 配置mapred-site.xml
[yyl@node1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[yyl@node1 hadoop]$ vim mapred-site.xml


  mapreduce.framework.name
  yarn

 
  mapreduce.jobhistory.address 
  node1.zhch:10020,node2.zhch:10020 
 
 
  mapreduce.jobhistory.webapp.address 
  node1.zhch:19888,node2.zhch:19888 



## 配置yarn-site.xml
[yyl@node1 hadoop]$ vim yarn-site.xml 


  yarn.nodemanager.aux-services
  mapreduce_shuffle


  yarn.nodemanager.aux-services.mapreduce.shuffle.class
  org.apache.hadoop.mapred.ShuffleHandler


  yarn.resourcemanager.address
  node1.zhch:8032


  yarn.resourcemanager.scheduler.address
  node1.zhch:8030


  yarn.resourcemanager.resource-tracker.address
  node1.zhch:8031


  yarn.resourcemanager.admin.address
  node1.zhch:8033


  yarn.resourcemanager.webapp.address
  node1.zhch:8088



## 分发到各个节点
[yyl@node1 hadoop]$ cd /home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node2.zhch:/home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node3.zhch:/home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node4.zhch:/home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node5.zhch:/home/yyl/program/
## 在各个节点上设置hadoop环境变量
[yyl@node1 ~]$ vim .bash_profile 
export HADOOP_PREFIX=/home/yyl/program/hadoop-2.5.2
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

三、创建znode
把各个 zookeeper 启动,在其中一个 namenode 节点执行如下命令,用于在 Zookeeper 中创建一个 znode

[yyl@node1 ~]$ hdfs zkfc -formatZK
## 验证创建是否成功:
[yyl@node3 ~]$ zkCli.sh 
[zk: localhost:2181(CONNECTED) 0] ls /
[hadoop-ha, zookeeper]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[mycluster]
[zk: localhost:2181(CONNECTED) 2]

四、启动journalnode
在 node1.zhch、node2.zhch、node3.zhch 上运行命令:hadoop-daemon.sh start journalnode

[yyl@node1 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-journalnode-node1.zhch.out
[yyl@node1 ~]$ jps
1126 QuorumPeerMain
1349 JournalNode
1395 Jps
[yyl@node2 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-journalnode-node2.zhch.out
[yyl@node2 ~]$ jps
1524 JournalNode
1570 Jps
1376 QuorumPeerMain
[yyl@node3 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-journalnode-node3.zhch.out
[yyl@node3 ~]$ jps
1289 JournalNode
1126 QuorumPeerMain
1335 Jps

五、NameNode

## 在 主namenode 节点上使用命令 hadoop namenode -format 格式化 namenode 和 journalnode 目录
[yyl@node1 ~]$ hadoop namenode -format

## 启动主namenode
[yyl@node1 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node1.zhch.out
[yyl@node1 ~]$ jps
1478 NameNode
1561 Jps
1126 QuorumPeerMain
1349 JournalNode

## 在 备namenode节点 同步元数据
[yyl@node2 ~]$ hdfs namenode -bootstrapStandby

## 启动 备NameNode
[yyl@node2 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node2.zhch.out
[yyl@node2 ~]$ jps
1524 JournalNode
1626 NameNode
1709 Jps
1376 QuorumPeerMain
## 在 两个namenode节点 都执行以下命令来配置自动故障转移:安装和运行ZKFC
[yyl@node1 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-zkfc-node1.zhch.out
[yyl@node1 ~]$ jps
1624 DFSZKFailoverController
1478 NameNode
1682 Jps
1126 QuorumPeerMain
1349 JournalNode
[yyl@node2 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-zkfc-node2.zhch.out
[yyl@node2 ~]$ jps
1524 JournalNode
1746 DFSZKFailoverController
1626 NameNode
1800 Jps
1376 QuorumPeerMain


六、启动 DataNode 和 Yarn

[yyl@node1 ~]$ hadoop-daemons.sh start datanode
node4.zhch: starting datanode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-datanode-node4.zhch.out
node3.zhch: starting datanode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-datanode-node3.zhch.out
node5.zhch: starting datanode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-datanode-node5.zhch.out

[yyl@node1 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-resourcemanager-node1.zhch.out
node4.zhch: starting nodemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-nodemanager-node4.zhch.out
node3.zhch: starting nodemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-nodemanager-node3.zhch.out
node5.zhch: starting nodemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-nodemanager-node5.zhch.out
[yyl@node1 ~]$ jps
1763 ResourceManager
1624 DFSZKFailoverController
1478 NameNode
1126 QuorumPeerMain
1349 JournalNode
2028 Jps
[yyl@node3 ~]$ jps
1289 JournalNode
1462 NodeManager
1367 DataNode
1126 QuorumPeerMain
1559 Jps

下次启动的时候,在 zookeeper 集群已启动的前提下,直接执行以下命令就可以全部启动所有进程和服务了:
sh start-dfs.sh
sh start-yarn.sh

可以通过URL来查看namenode状态
http://node1.zhch:50070    http://node2.zhch:50070
也可以通过命令来查看
[yyl@node1 ~]$ hdfs haadmin -getServiceState nn1
active
[yyl@node1 ~]$ hdfs haadmin -getServiceState nn2
standby

七、测试
在主namenode机器上通过jps命令查找到namenode的进程号,然后通过kill -9的方式杀掉进程,观察另一个namenode节点是否会从状态standby变成active状态:

[yyl@node1 ~]$ jps
1763 ResourceManager
1624 DFSZKFailoverController
1478 NameNode
2128 Jps
1126 QuorumPeerMain
1349 JournalNode
[yyl@node1 ~]$ kill -9 1478
[yyl@node1 ~]$ hdfs haadmin -getServiceState nn2
active
[yyl@node1 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node1.zhch.out
[yyl@node1 ~]$ hdfs haadmin -getServiceState nn1
standby

以上是“Hadoop2 namenode HA的示例分析”这篇文章的所有内容,感谢各位的阅读!希望分享的内容对大家有帮助,更多相关知识,欢迎关注创新互联行业资讯频道!


新闻名称:Hadoop2namenodeHA的示例分析
本文地址:http://scyanting.com/article/joppjp.html