Main content of this section:
Using ZooKeeper to set up HDFS HA (high availability)
1. System environment:
OS: CentOS Linux release 7.5.1804 (Core)
CPU: 2 cores
Memory: 1GB
Running user: root
JDK version: 1.8.0_252
Hadoop version: cdh5.16.2
2. Cluster node role assignments:
172.26.37.245 node1.hadoop.com ----> namenode, zookeeper, journalnode, hadoop-hdfs-zkfc
172.26.37.246 node2.hadoop.com ----> datanode, zookeeper, journalnode
172.26.37.247 node3.hadoop.com ----> datanode
172.26.37.248 node4.hadoop.com ----> namenode, zookeeper, journalnode, hadoop-hdfs-zkfc
172.26.37.248 node4.hadoop.com ----> the SecondaryNameNode role is removed (no longer needed once HA is enabled); hostname resolution is sketched below
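All hostnames above must resolve on every node. If DNS is not available, a minimal /etc/hosts sketch (appended on each node; adjust to your environment) could look like this:
# cat >> /etc/hosts <<'EOF'
172.26.37.245 node1.hadoop.com
172.26.37.246 node2.hadoop.com
172.26.37.247 node3.hadoop.com
172.26.37.248 node4.hadoop.com
EOF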
I. Installation
On node1.hadoop.com:
# yum -y install zookeeper-server hadoop-hdfs-journalnode hadoop-hdfs-zkfc
On node2.hadoop.com:
# yum -y install zookeeper-server hadoop-hdfs-journalnode
On node4.hadoop.com:
# yum -y install zookeeper-server hadoop-hdfs-journalnode hadoop-hdfs-zkfc hadoop-hdfs-namenode
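To confirm the packages landed on a host, a quick check with rpm is possible (adjust the package list per node, since the installed set differs):
# rpm -q zookeeper-server hadoop-hdfs-journalnode hadoop-hdfs-zkfc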
II. Configuration files
1. /etc/hadoop/conf/core-site.xml (all nodes)
# cp -p /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/core-site.xml.20200617
# vi /etc/hadoop/conf/core-site.xml
Change the contents to the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
<!-- Logical cluster (nameservice) address -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
<!-- Hadoop temporary file directory -->
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>node1.hadoop.com:2181,node2.hadoop.com:2181,node4.hadoop.com:2181</value>
<!-- Hosts that run the ZooKeeper quorum -->
</property>
</configuration>
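Once the file is in place, the effective values can be sanity-checked with hdfs getconf (assuming the Hadoop client is present on the node):
# hdfs getconf -confKey fs.defaultFS //expect hdfs://cluster1
# hdfs getconf -confKey ha.zookeeper.quorum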
2. /etc/hadoop/conf/hdfs-site.xml (all nodes)
# cp -p /etc/hadoop/conf/hdfs-site.xml /etc/hadoop/conf/hdfs-site.xml.20200617
# vi /etc/hadoop/conf/hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
<!-- Custom HDFS nameservice name -->
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>node1,node4</value>
<!-- Logical names of the NameNodes taking part in HA; note: these names are user-defined and are resolved to real hosts by the properties below -->
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.node1</name>
<!-- Resolves the logical name node1 to a real host and specifies the RPC port -->
<value>node1.hadoop.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.node1</name>
<!-- Resolves the logical name node1 to its HTTP management address and port -->
<value>node1.hadoop.com:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.node4</name>
<value>node4.hadoop.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.node4</name>
<value>node4.hadoop.com:50070</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<!-- Automatic failover for HA; off by default, enabled here -->
<value>true</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node1.hadoop.com:8485;node2.hadoop.com:8485;node4.hadoop.com:8485/cluster1</value>
<!-- Where the NameNode's shared edit log is stored on the JournalNodes -->
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<!-- Class invoked by clients to find the active NameNode during failover -->
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/data/journaldata/jn</value>
<!-- Local disk path where the JournalNodes store the shared NameNode data -->
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/bin/true)
sshfence
</value>
<!-- Fencing methods; multiple methods are separated by newlines, one method per line -->
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
<!-- sshfence requires passwordless SSH; this is the private key it uses -->
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>10000</value>
<!-- Timeout (ms) for the sshfence method -->
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hdfs/data</value>
</property>
</configuration>
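The sshfence method assumes passwordless SSH between the two NameNodes using the private key configured in dfs.ha.fencing.ssh.private-key-files. A minimal sketch, assuming a hadoop user owns that key (adjust user, host, and paths to your environment; run on node1 and repeat in the other direction on node4):
# sudo -u hadoop ssh-keygen -t rsa -N "" -f /home/hadoop/.ssh/id_rsa
# sudo -u hadoop ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@node4.hadoop.com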
3. /etc/zookeeper/conf/zoo.cfg (node1, node2, node4)
# cp -p /etc/zookeeper/conf/zoo.cfg /etc/zookeeper/conf/zoo.cfg.org
# vi /etc/zookeeper/conf/zoo.cfg
Append the following:
server.1=node1.hadoop.com:2888:3888
server.2=node2.hadoop.com:2888:3888
server.3=node4.hadoop.com:2888:3888
# hosts and ports of the ZooKeeper ensemble used by the HA setup (zoo.cfg takes # comments, not XML comments)
Write the matching myid on each host:
On node1:
# echo "1" >/var/lib/zookeeper/myid
On node2:
# echo "2" >/var/lib/zookeeper/myid
On node4:
# echo "3" >/var/lib/zookeeper/myid
The myid on node1 corresponds to server.1 above; the other hosts use 2 and 3 respectively.
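The myid file must sit in the dataDir declared in zoo.cfg (the packaged default is /var/lib/zookeeper); a quick consistency check on each ZooKeeper node:
# grep '^dataDir' /etc/zookeeper/conf/zoo.cfg //should point at /var/lib/zookeeper
# cat /var/lib/zookeeper/myid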
III. Starting the services
0. Take care of the DataNode nodes and the original NameNode node first; continuing from the previous environment, stop the namenode service on node1:
# service hadoop-hdfs-namenode stop
1. Start ZooKeeper on all ZooKeeper nodes (node1, node2, node4)
# cd /var/lib/zookeeper
# mkdir version-2
# chown -R zookeeper:zookeeper /var/lib/zookeeper
# chmod -R 755 /var/lib/zookeeper
# service zookeeper-server start
# service zookeeper-server status
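Beyond the service status, each node's role in the ensemble can be checked with the stat four-letter command (assuming nc is installed); one node should report Mode: leader and the others Mode: follower:
# echo stat | nc localhost 2181 | grep Mode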
2. Start the journalnode service on all JournalNode nodes (node1, node2, node4)
Create /home/hadoop/data/journaldata/jn first:
# mkdir -p /home/hadoop/data/journaldata/jn
# chown -R hdfs:hdfs /home/hadoop/data/journaldata/jn
# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-journalnode status
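By default a JournalNode listens on 8485 (RPC, the port referenced in dfs.namenode.shared.edits.dir) and 8480 (HTTP); a quick check that it is listening:
# ss -lnt | grep -E ':(8485|8480)'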
3. Format on the primary node, node1
# sudo -u hdfs hdfs namenode -format //format the namenode
# sudo -u hdfs hdfs zkfc -formatZK //initialize the HA state in ZooKeeper
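After -formatZK, the HA parent znode should exist in ZooKeeper; it can be checked with the packaged zookeeper-client (the child znode name matches dfs.nameservices):
# zookeeper-client -server node1.hadoop.com:2181 ls /hadoop-ha //expect [cluster1]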
4. Start the namenode service on the NameNode nodes (node1 and node4)
# service hadoop-hdfs-namenode start
# service hadoop-hdfs-namenode status
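If a NameNode does not stay up, its log (the packaged default location is /var/log/hadoop-hdfs/) usually shows the reason:
# tail -n 50 /var/log/hadoop-hdfs/hadoop-hdfs-namenode-*.log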
5. Start the zkfc service on the NameNode nodes (node1 and node4)
# service hadoop-hdfs-zkfc start
# service hadoop-hdfs-zkfc status
6. On the standby NameNode node (node4 here), synchronize the metadata
# sudo -u hdfs hdfs namenode -bootstrapStandby
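After bootstrapStandby, the standby's metadata directory (dfs.namenode.name.dir above, i.e. /data/hdfs/name) should contain the copied fsimage:
# ls /data/hdfs/name/current/ //expect fsimage_*, VERSION, seen_txid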
7. After the data is synchronized, open each NameNode's web UI on port 50070:
one should show active
and the other standby.
If both show active or both show standby, check the configuration and the startup order.
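The same check can be done from the command line with hdfs haadmin, using the logical NameNode names defined in dfs.ha.namenodes.cluster1:
# sudo -u hdfs hdfs haadmin -getServiceState node1
# sudo -u hdfs hdfs haadmin -getServiceState node4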
8. Stop the namenode service on the active node (node1)
# service hadoop-hdfs-namenode stop
# service hadoop-hdfs-namenode status
The standby NameNode automatically becomes active.
9. Start the namenode service on node1 again; it does not preempt the active role and comes up as standby.
# service hadoop-hdfs-namenode start
# service hadoop-hdfs-namenode status
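The final state can be confirmed the same way; after step 9, node1 should report standby and node4 active (assuming node4 took over in step 8):
# sudo -u hdfs hdfs haadmin -getServiceState node1 //expect standby
# sudo -u hdfs hdfs haadmin -getServiceState node4 //expect active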