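Docker makes it possible to set up a big data environment quickly, and this article walks through how to do it.
Usage
I use HDP 3.0.1 here, the latest version at the time of writing, as the demonstration. It includes the following component versions:
Component | Version |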
---|---|
HDFS | 3.1.1 |
YARN | 3.1.1 |
MapReduce2 | 3.1.1 |
Tez | 0.9.1 |
Hive | 3.1.0 |
HBase | 2.0.0 |
Pig | 0.16.0 |
Sqoop | 1.4.7 |
Oozie | 4.3.1 |
ZooKeeper | 3.4.6 |
Storm | 1.2.1 |
Infra Solr | 0.1.0 |
Atlas | 1.0.0 |
Kafka | 1.1.1 |
Knox | 1.0.0 |
Ranger | 1.1.0 |
Spark2 | 2.3.1 |
Zeppelin Notebook | 0.8.0 |
Data Analytics Studio | 1.0.2.0.0 |
Druid | 0.12.1 |
Superset | 0.23.0 |
Installation steps
First, pull the Docker images (fair warning: they take up about 26 GB, so the download is slow, but it is well worth it)
docker pull hortonworks/sandbox-hdp:3.0.1
docker pull hortonworks/sandbox-proxy:1.0
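Because the images need roughly 26 GB, it may be worth checking free disk space before pulling. The following is a minimal sketch, not part of the original scripts: `has_free_gb` is a hypothetical helper, and `/var/lib/docker` is assumed to be Docker's data directory (the usual default on Linux, but your installation may differ).

```shell
#!/bin/sh
# Hypothetical helper: succeed if <dir> has at least <gb> GiB available.
# Usage: has_free_gb <dir> <gb>
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P prints POSIX-format output; on the second line,
    # column 4 is the available space in KiB (because of -k).
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example (assumes Docker stores images under /var/lib/docker):
#   has_free_gb /var/lib/docker 30 || echo "less than 30 GiB free"
```

Asking for a few GiB more than the image size leaves some headroom for the running containers.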
Download the startup configuration scripts
git clone https://github.com/dounine/sandbox-hdp-3.0.1.git
Add a hosts mapping
127.0.0.1 sandbox-hdp.hortonworks.com
# Alternatively, map the hostname to the machine's public or LAN IP
Run the deployment script
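If you would rather script the hosts mapping than edit the file by hand, something along these lines could work. This is only a sketch: `add_hosts_entry` is a hypothetical helper, and writing to `/etc/hosts` requires root.

```shell
#!/bin/sh
# Hypothetical helper: append "IP HOSTNAME" to a hosts file
# unless the hostname already appears in it.
# Usage: add_hosts_entry <hosts-file> <ip> <hostname>
add_hosts_entry() {
    hosts_file=$1
    ip=$2
    host=$3
    # -w matches whole words only; -F treats the hostname as a literal string.
    if grep -qwF "$host" "$hosts_file" 2>/dev/null; then
        echo "$host already present in $hosts_file"
    else
        printf '%s %s\n' "$ip" "$host" >> "$hosts_file"
        echo "added $ip $host to $hosts_file"
    fi
}

# Example (run as root):
#   add_hosts_entry /etc/hosts 127.0.0.1 sandbox-hdp.hortonworks.com
```

The check makes the step safe to repeat, so rerunning it does not pile up duplicate lines. Note that it also counts a commented-out entry as present.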
./docker-deploy-hdp265.sh
Output on success
root@lake /s/d/sandbox-hdp-3.0.1# ./docker-deploy-hdp265.sh
+ registry=hortonworks
+ name=sandbox-hdp
+ version=3.0.1
+ proxyName=sandbox-proxy
+ proxyVersion=1.0
+ flavor=hdp
+ echo hdp
+ mkdir -p sandbox/proxy/conf.d
+ mkdir -p sandbox/proxy/conf.stream.d
+ docker pull hortonworks/sandbox-hdp:3.0.1
3.0.1: Pulling from hortonworks/sandbox-hdp
Digest: sha256:7b767af7b42030fb1dd0f672b801199241e6bef1258e3ce57361edb779d95921
Status: Image is up to date for hortonworks/sandbox-hdp:3.0.1
+ docker pull hortonworks/sandbox-proxy:1.0
1.0: Pulling from hortonworks/sandbox-proxy
Digest: sha256:42e4cfbcbb76af07e5d8f47a183a0d4105e65a1e7ef39fe37ab746e8b2523e9e
Status: Image is up to date for hortonworks/sandbox-proxy:1.0
+ '[' hdp == hdf ']'
+ '[' hdp == hdp ']'
+ hostname=sandbox-hdp.hortonworks.com
++ docker images
++ grep hortonworks/sandbox-hdp
++ awk '{print $2}'
+ version=3.0.1
+ docker network create cda
+ docker run --privileged --name sandbox-hdp -h sandbox-hdp.hortonworks.com --network=cda --network-alias=sandbox-hdp.hortonworks.com -d hortonworks/sandbox-hdp:3.0.1
46bf6b414dd3c0fb36a3816eac129219d30d49ea9421898158800e0ab3576048
+ echo ' Remove existing postgres run files. Please wait'
Remove existing postgres run files. Please wait
+ sleep 2
+ docker exec -t sandbox-hdp sh -c 'rm -rf /var/run/postgresql/*; systemctl restart postgresql;'
Failed to restart postgresql.service: Unit not found.
+ sed s/sandbox-hdp-security/sandbox-hdp/g assets/generate-proxy-deploy-script.sh
+ mv -f assets/generate-proxy-deploy-script.sh.new assets/generate-proxy-deploy-script.sh
+ chmod +x assets/generate-proxy-deploy-script.sh
+ assets/generate-proxy-deploy-script.sh
+ uname
+ grep MINGW
+ chmod +x sandbox/proxy/proxy-deploy.sh
+ sandbox/proxy/proxy-deploy.sh
7fa5c4d0737a6b71796fe997baf397d4078907d83fcfaa2a8c0f241772547147
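Once the script finishes, a quick way to confirm that both containers are up is to compare the names reported by `docker ps` with the ones you expect. The sketch below is an illustration, not part of the repository: `missing_containers` is a hypothetical helper, and the proxy container name `sandbox-proxy` is an assumption based on the `proxyName` variable in the log above.

```shell
#!/bin/sh
# Hypothetical helper: read running container names from stdin (one per line)
# and print every expected name given as an argument that is not among them.
missing_containers() {
    running=$(cat)
    for name in "$@"; do
        # -x requires the whole line to match; -F treats the name literally.
        printf '%s\n' "$running" | grep -qxF "$name" || echo "$name"
    done
}

# Example (container names are assumptions, adjust as needed):
#   docker ps --format '{{.Names}}' | missing_containers sandbox-hdp sandbox-proxy
```

No output means every expected container is running. Any name that is printed can be inspected with `docker logs <name>`.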
You need to reset the Ambari admin password before you can log in
docker exec -ti sandbox-hdp bash
ambari-admin-password-reset # reset the admin password
Output
[root@sandbox-hdp /]# ambari-admin-password-reset
Please set the password for admin:
Please retype the password for admin:
The admin password has been set.
Restarting ambari-server to make the password change effective...
Using python /usr/bin/python
Restarting ambari-server
Waiting for server stop...
Ambari Server stopped
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start...................
Server started listening on 8080
DB configs consistency check: no errors and warnings were found.
Log in to the Ambari web UI at http://localhost:8080 with the admin account and the password you just set
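Ambari can take a while to start listening after a restart, so if the page does not load right away, polling the port until it answers may save some guesswork. Here is a minimal sketch: `retry` is a hypothetical helper, and the attempt count and delay are arbitrary values chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between tries; succeed as soon as it succeeds.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        # Only sleep if another attempt is still to come.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Example: try for up to about five minutes, every 10 seconds.
#   retry 30 10 curl -fsS -o /dev/null http://localhost:8080 && echo "Ambari is up"
```

With `-f`, `curl` returns a non-zero status on HTTP errors as well as on connection failures, so the loop keeps going until the web UI actually responds.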