Environment Preparation
1. Set a static IP
2. Edit /etc/hosts
3. Disable the firewall
4. Create a user and grant it the needed permissions
5. Install the JDK and Hadoop
- Uninstall any existing JDK
(1) Check whether Java is already installed:
[atguigu@hadoop101 opt]$ rpm -qa | grep java
(2) If the installed version is older than 1.7, uninstall it:
[atguigu@hadoop101 opt]$ sudo rpm -e <package-name>
(3) Check the JDK install path:
[atguigu@hadoop101 ~]$ which java
6. Configure environment variables (append the following to /etc/profile, then source it)
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
source /etc/profile
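A quick sanity check for the exports above; a minimal sketch assuming the JDK and Hadoop install paths used in this guide:

```shell
# Same exports as above; the paths assume the versions used in this guide.
export JAVA_HOME=/opt/module/jdk1.8.0_144
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Confirm both directories actually landed on PATH.
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME/bin on PATH" ;;
esac
case ":$PATH:" in
  *":$HADOOP_HOME/sbin:"*) echo "HADOOP_HOME/sbin on PATH" ;;
esac
```

If `java -version` or `hadoop version` still fails after `source /etc/profile`, the paths above likely differ from your actual install locations.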
| hadoop102 | hadoop103 | hadoop104 |
|---|---|---|
| NameNode | ResourceManager | SecondaryNameNode |
| DataNode | DataNode | DataNode |
| NodeManager | NodeManager | NodeManager |
- YARN
  - ResourceManager (1)
- HDFS
  - NameNode (1)
  - SecondaryNameNode (1)
- DataNode and NodeManager (3 each)
Cluster Configuration (three-node setup)
1) Prepare 3 machines (firewall off, static IPs, hostnames set)
2) Install the JDK
3) Configure environment variables
4) Install Hadoop
5) Configure environment variables
6) Configure the cluster
7) Start daemons one at a time
8) Configure passwordless SSH
9) Start the whole cluster and test it
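Step 9 (bringing the whole cluster up with start-dfs.sh / start-yarn.sh) reads the worker list from $HADOOP_HOME/etc/hadoop/slaves in Hadoop 2.x. Assuming the three hosts used in this guide, the file would contain:

```
hadoop102
hadoop103
hadoop104
```

The file must not contain blank lines or trailing spaces; each entry becomes an SSH target when the cluster scripts fan out.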
(1) Core configuration file
Configure core-site.xml:
[atguigu@hadoop102 hadoop]$ vi core-site.xml
Add the following to the file:
<!-- Address of the NameNode in HDFS -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop102:9000</value>
</property>
<!-- Directory for files Hadoop generates at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
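The property blocks shown here (and in the files below) must sit inside the file's single `<configuration>` element; assembled, a minimal core-site.xml would look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Address of the NameNode in HDFS -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop102:9000</value>
    </property>
    <!-- Directory for files Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-2.7.2/data/tmp</value>
    </property>
</configuration>
```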
(2) HDFS configuration files
Configure hadoop-env.sh:
[atguigu@hadoop102 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
Configure hdfs-site.xml:
[atguigu@hadoop102 hadoop]$ vi hdfs-site.xml
Add the following to the file:
<!-- Number of block replicas -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Host and port of the SecondaryNameNode -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop104:50090</value>
</property>
(3) YARN configuration files
Configure yarn-env.sh:
[atguigu@hadoop102 hadoop]$ vi yarn-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
Configure yarn-site.xml:
[atguigu@hadoop102 hadoop]$ vi yarn-site.xml
Add the following to the file:
<!-- How reducers fetch intermediate data -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Hostname of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop103</value>
</property>
(4) MapReduce configuration files
Configure mapred-env.sh:
[atguigu@hadoop102 hadoop]$ vi mapred-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
Configure mapred-site.xml (created from the bundled template):
[atguigu@hadoop102 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[atguigu@hadoop102 hadoop]$ vi mapred-site.xml
Add the following to the file:
<!-- Run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Startup Commands
1. hdfs namenode -format   (format the NameNode; first start only)
2. hadoop-daemon.sh start namenode
3. hadoop-daemon.sh start datanode
4. start-dfs.sh   (once passwordless SSH is configured, starts HDFS on all nodes)
5. start-yarn.sh  (run on the ResourceManager host, hadoop103)
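Once the daemons are up, `jps` on each host should list the roles assigned to that host (PIDs will differ; `Jps` itself also appears). For example, on the NameNode host hadoop102 the output looks roughly like:

```
[atguigu@hadoop102 ~]$ jps
3461 NameNode
3608 DataNode
3767 NodeManager
3890 Jps
```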
Viewing Log Output
export HADOOP_ROOT_LOGGER=INFO,console
export HADOOP_ROOT_LOGGER=DEBUG,console
To silence the native-library warning, add the following to $HADOOP_HOME/etc/hadoop/log4j.properties:
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
Time Synchronization (use hadoop102 as the time server; the other nodes sync from it)
rpm -qa|grep ntp
service ntpd stop
chkconfig ntpd off
vim /etc/ntp.conf
1. Uncomment this line (lets hosts on the cluster subnet query this server):
# restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
2. Comment out all of the following (do not sync with external time sources):
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
3. Add the following (fall back to the local clock as the time source):
server 127.127.1.0
fudge 127.127.1.0 stratum 10
4. vim /etc/sysconfig/ntpd and add:
SYNC_HWCLOCK=yes
Then restart the service: service ntpd start && chkconfig ntpd on
On each of the other machines, schedule a sync from hadoop102 every 10 minutes:
crontab -e
*/10 * * * * /usr/sbin/ntpdate hadoop102
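The `*/10` in the minute field means the job fires every ten minutes; a throwaway shell line (unrelated to Hadoop, purely to illustrate the schedule) shows the minutes it matches within each hour:

```shell
# Minutes of each hour matched by the cron field */10:
seq 0 10 59 | tr '\n' ' '; echo
# prints: 0 10 20 30 40 50
```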