Hadoop download: http://apache.fayea.com/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Main reference: http://www.open-open.com/lib/view/open1435761287778.html
I have three identical virtual machines, all running CentOS 7:
IP address | hostname |
---|---|
10.0.0.172 | node2 |
10.0.0.171 | slave171 |
10.0.0.185 | slave175 |
1) node2 (10.0.0.172) can ssh to 10.0.0.172, 10.0.0.171 and 10.0.0.185 without a password (a minimal setup sketch follows this list)
2) the firewall is disabled on all three machines
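A minimal sketch of the passwordless ssh in (1), assuming everything runs as root with the default key path:
$ ssh-keygen -t rsa            # on node2, accept all defaults
$ ssh-copy-id root@10.0.0.172  # copy the key to every node, including node2 itself
$ ssh-copy-id root@10.0.0.171
$ ssh-copy-id root@10.0.0.185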
Install Java
Download jdk-8u102-linux-x64.tar and put it under /home/java
tar -xvf jdk-8u102-linux-x64.tar
Edit /etc/profile and set the Java environment variables:
export JAVA_HOME=/home/java/jdk1.8.0_102
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
To make the configuration take effect, run:
source /etc/profile
Check the result with java -version.
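If the setup worked, the output looks roughly like this:
$ java -version
java version "1.8.0_102"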
Install Hadoop
- Download hadoop-2.7.3.tar.gz and put it under /home/hadoop
- Extract it: tar -xzvf hadoop-2.7.3.tar.gz
- Under /home/hadoop, create the data directories tmp, dfs, dfs/data and dfs/name; the names must match the paths configured in hdfs-site.xml below (a one-liner follows this list)
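All of them can be created in one go:
$ mkdir -p /home/hadoop/tmp /home/hadoop/dfs/name /home/hadoop/dfs/data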
Configure core-site.xml under /home/hadoop/hadoop-2.7.3/etc/hadoop
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.0.0.172:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131702</value>
</property>
</configuration>
name | value | Description |
---|---|---|
fs.defaultFS | hdfs://10.0.0.172:9000 | URI and port of the Hadoop master (NameNode) |
io.file.buffer.size | 131702 | Buffer size used for reading and writing sequence files |
Configure hdfs-site.xml under /home/hadoop/hadoop-2.7.3/etc/hadoop
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>10.0.0.172:50090</value>
</property>
<!-- dfs.secondary.http.address is the deprecated name for the same setting;
     the reference article sets both (9001 and 50090), which conflicts -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
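Since dfs.webhdfs.enabled is true, the NameNode also exposes the WebHDFS REST API once the cluster is up; a quick sanity check, assuming the default NameNode HTTP port 50070:
$ curl "http://10.0.0.172:50070/webhdfs/v1/?op=LISTSTATUS"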
Configure mapred-site.xml under /home/hadoop/hadoop-2.7.3/etc/hadoop
By default, /home/hadoop/hadoop-2.7.3/etc/hadoop/ only ships a mapred-site.xml.template file. Copy it to mapred-site.xml; this file specifies which framework MapReduce runs on.
Copy and rename:
cp mapred-site.xml.template mapred-site.xml
Then edit mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>10.0.0.172:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>10.0.0.172:19888</value>
</property>
</configuration>
Configure yarn-site.xml in the same directory
The yarn.* properties go into yarn-site.xml, not mapred-site.xml where the reference article puts them; the YARN daemons only read yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>10.0.0.172:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.0.0.172:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>10.0.0.172:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>10.0.0.172:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>10.0.0.172:8088</value>
</property>
</configuration>
The reference article's configuration has one more property at the end:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>768</value>
</property>
Adding it makes NodeManager fail to start: 768 MB is below the memory a NodeManager needs to come up, so leave it out (or set a much larger value, as in the "stuck in RUNNING" fix below).
Set JAVA_HOME in hadoop-env.sh and yarn-env.sh
Both files live in /home/hadoop/hadoop-2.7.3/etc/hadoop/. Find the line that exports JAVA_HOME in each file and change it to:
export JAVA_HOME=/home/java/jdk1.8.0_102
Configure slaves
$ vi slaves
10.0.0.171
10.0.0.185
Start the cluster
$ cd /home/hadoop/hadoop-2.7.3
# format the NameNode (first run only)
$ bin/hdfs namenode -format
# start HDFS and YARN
$ sbin/start-dfs.sh
$ sbin/start-yarn.sh
or
$ sbin/start-all.sh
Check the results with jps
jps is a Java tool that lists the running Java processes.
On a DataNode:
$ jps
Jps
NodeManager
DataNode
On the NameNode:
$ jps
Jps
ResourceManager
SecondaryNameNode
NameNode
If one of these is missing, e.g. DataNode, something went wrong; check the logs to find out what.
Testing the cluster
Create an input directory on HDFS:
$ ./bin/hdfs dfs -mkdir /input
Inspect the HDFS filesystem:
$ ./bin/hdfs dfs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2016-11-04 20:11 /data
drwxr-xr-x - root supergroup 0 2016-11-04 20:12 /input
drwx------ - root supergroup 0 2016-11-04 19:21 /tmp
Create a local input directory:
$ mkdir ./input
Put some files into it:
$ cp ./etc/hadoop/*.xml ./input
Upload the contents of the local input directory to /input on HDFS:
$ ./bin/hdfs dfs -put input/* /input/
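Optionally confirm the upload:
$ ./bin/hdfs dfs -ls /input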
Run the grep example that ships with Hadoop:
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep /input/ ./output 'dfs[a-z.]+'
16/11/09 14:40:12 INFO client.RMProxy: Connecting to ResourceManager at /10.0.0.172:8032
16/11/09 14:40:14 INFO input.FileInputFormat: Total input paths to process : 9
16/11/09 14:40:14 INFO mapreduce.JobSubmitter: number of splits:9
16/11/09 14:40:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1478665369284_0003
16/11/09 14:40:15 INFO impl.YarnClientImpl: Submitted application application_1478665369284_0003
16/11/09 14:40:16 INFO mapreduce.Job: The url to track the job: http://node2:8088/proxy/application_1478665369284_0003/
16/11/09 14:40:16 INFO mapreduce.Job: Running job: job_1478665369284_0003
16/11/09 14:40:32 INFO mapreduce.Job: Job job_1478665369284_0003 running in uber mode : false
16/11/09 14:40:32 INFO mapreduce.Job: map 0% reduce 0%
16/11/09 14:40:44 INFO mapreduce.Job: map 11% reduce 0%
16/11/09 14:40:51 INFO mapreduce.Job: map 22% reduce 0%
16/11/09 14:40:59 INFO mapreduce.Job: map 33% reduce 0%
16/11/09 14:41:07 INFO mapreduce.Job: map 44% reduce 0%
16/11/09 14:41:15 INFO mapreduce.Job: map 56% reduce 0%
16/11/09 14:41:22 INFO mapreduce.Job: map 67% reduce 0%
16/11/09 14:41:29 INFO mapreduce.Job: map 78% reduce 0%
16/11/09 14:41:35 INFO mapreduce.Job: map 89% reduce 0%
16/11/09 14:41:41 INFO mapreduce.Job: map 100% reduce 0%
16/11/09 14:41:49 INFO mapreduce.Job: map 100% reduce 100%
16/11/09 14:41:49 INFO mapreduce.Job: Job job_1478665369284_0003 completed successfully
16/11/09 14:41:49 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=231
FILE: Number of bytes written=1191235
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=29823
HDFS: Number of bytes written=353
HDFS: Number of read operations=30
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=9
Launched reduce tasks=1
Data-local map tasks=9
Total time spent by all maps in occupied slots (ms)=58974
Total time spent by all reduces in occupied slots (ms)=6870
Total time spent by all map tasks (ms)=58974
Total time spent by all reduce tasks (ms)=6870
Total vcore-milliseconds taken by all map tasks=58974
Total vcore-milliseconds taken by all reduce tasks=6870
Total megabyte-milliseconds taken by all map tasks=60389376
Total megabyte-milliseconds taken by all reduce tasks=7034880
Map-Reduce Framework
Map input records=841
Map output records=7
Map output bytes=211
Map output materialized bytes=279
Input split bytes=978
Combine input records=7
Combine output records=7
Reduce input groups=7
Reduce shuffle bytes=279
Reduce input records=7
Reduce output records=7
Spilled Records=14
Shuffled Maps =9
Failed Shuffles=0
Merged Map outputs=9
GC time elapsed (ms)=1039
CPU time spent (ms)=7620
Physical memory (bytes) snapshot=2067628032
Virtual memory (bytes) snapshot=20762001408
Total committed heap usage (bytes)=1490026496
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=28845
File Output Format Counters
Bytes Written=353
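When the job completes, the grep results sit in the output directory (a relative path on HDFS resolves under /user/<user>). View them with:
$ ./bin/hdfs dfs -cat output/*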
Common problems
Name or service not known
The hostname does not match the entries in /etc/hosts.
$ hostname
node2
$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
$ vi /etc/hosts
127.0.0.1 localhost localhost.localdomain node2 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Or add the IP-to-hostname mappings of the nodes directly to /etc/hosts:
10.0.0.185 Node185
Once the entries in /etc/hosts match the hostnames, the error goes away.
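Alternatively, on CentOS 7 the hostname itself can be changed to match an existing /etc/hosts entry; a one-liner, assuming the machine should be called node2:
$ hostnamectl set-hostname node2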
DataNode fails to start: namenode:9000 is unreachable
Use lsof to check whether the port is open:
$ lsof -i:9000
This is usually fixed by simply disabling the firewall:
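On CentOS 7 the firewall is firewalld; stop it and keep it off across reboots:
$ systemctl stop firewalld
$ systemctl disable firewalld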
INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
Again, usually fixed by disabling the firewall as shown above.
Datanode denied communication with namenode because hostname cannot be resolved
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved
Add the following to hdfs-site.xml (this disables the hostname check; fixing /etc/hosts as above is the cleaner solution):
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
Error: hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException:
Running jps on the NameNode machine:
$ jps
78242 ResourceManager
77897 NameNode
78569 Jps
78088 SecondaryNameNode
The DataNode daemon is missing. Running jps on the DataNode machine shows its daemon is gone too; it can be started by hand:
$ sbin/hadoop-daemon.sh start datanode
$ jps
79409 SecondaryNameNode
78595 RunJar
79218 NameNode
79563 ResourceManager
80075 Jps
79964 DataNode
The problem is gone.
Actually one problem remained: NodeManager still would not start, because of the memory setting discussed above.
Job stuck in RUNNING state
This can be fixed by configuring yarn-site.xml:
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.0.0.172:8030</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>20480</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
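yarn.nodemanager.resource.memory-mb must stay within the node's physical RAM (the 20480 above assumes a machine with 20+ GB; adjust to your hardware). Check what a node actually has with:
$ free -m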
Incompatible clusterIDs in /home/hadoop/dfs/data: namenode clusterID =
$ sbin/stop-all.sh
$ rm -rf /home/hadoop/tmp/*
$ rm -rf /home/hadoop/dfs/data/*
$ rm -rf /home/hadoop/dfs/name/*
Clear the same data directories on every DataNode too (the stale clusterID lives in the DataNode's VERSION file), then re-format:
$ bin/hdfs namenode -format
If it still fails, double-check the hostname and read the daemon log:
$ hostname
$ cat logs/yarn-root-resourcemanager-node2.log
Error: JAVA_HOME is not set and could not be found.
$ bin/hdfs namenode -format
Error: JAVA_HOME is not set and could not be found.
Set JAVA_HOME in yarn-env.sh and hadoop-env.sh, as described above.
Setting the log level
Edit /home/hadoop/hadoop-2.7.3/sbin/hadoop-daemon.sh and change the log level:
export HADOOP_ROOT_LOGGER=DEBUG,console
Viewing the logs
Log directory: /home/hadoop/hadoop-2.7.3/logs
Every problem can be tracked down here; some logs are on the slaves and some on the master, so check both.
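Log file names follow the pattern <framework>-<user>-<daemon>-<hostname>.log, so, assuming the daemons run as root, the DataNode log on slave171 can be followed with:
$ tail -f /home/hadoop/hadoop-2.7.3/logs/hadoop-root-datanode-slave171.log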