https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html
1. First, download the Custom Service Descriptor (CSD) jar and the parcel files for the version you want to install:
https://www.cloudera.com/documentation/spark2/latest/topics/spark2_packaging.html#versions
2. Place the CSD jar in the default Local Descriptor Repository Path, /opt/cloudera/csd, and fix its owner, group, and permissions. Do this on every machine in the cluster; worker nodes may not have this directory by default, so create it with mkdir if it is missing.
[root@hadoop001 ~]# cd /opt/cloudera/csd
[root@hadoop001 csd]# ll
total 16
-rw-r--r-- 1 root root 16109 Mar 10 10:53 SPARK2_ON_YARN-2.1.0.cloudera1.jar
[root@hadoop001 csd]# chown cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.1.0.cloudera1.jar
[root@hadoop001 csd]# chmod 644 SPARK2_ON_YARN-2.1.0.cloudera1.jar
[root@hadoop001 csd]# ll
total 16
-rw-r--r-- 1 cloudera-scm cloudera-scm 16109 Mar 10 10:53 SPARK2_ON_YARN-2.1.0.cloudera1.jar
Note: make sure the cluster's Local Descriptor Repository Path really is /opt/cloudera/csd.
To check: CM home page –> Administration –> Settings –> Category –> Local Descriptor Repository Path
3. Upload the parcel files to a custom package directory (a new folder under /var/www/html here):
[root@hadoop001 csd]# cd /var/www/html
[root@hadoop001 html]# mkdir spark2_parcels
[root@hadoop001 html]# cd spark2_parcels/
[root@hadoop001 spark2_parcels]# rz
[root@hadoop001 spark2_parcels]# ll
total 173048
-rw-r--r-- 1 root root 4677 Mar 10 11:15 manifest.json
-rw-r--r-- 1 root root 177185276 Mar 10 10:55 SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904-el7.parcel
-rw-r--r-- 1 root root 41 Mar 10 10:54 SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904-el7.parcel.sha1
[root@hadoop001 spark2_parcels]# mv SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904-el7.parcel.sha1 SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904-el7.parcel.sha
[root@hadoop001 spark2_parcels]# ll
total 173048
-rw-r--r-- 1 root root 4677 Mar 10 11:15 manifest.json
-rw-r--r-- 1 root root 177185276 Mar 10 10:55 SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904-el7.parcel
-rw-r--r-- 1 root root 41 Mar 10 10:54 SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904-el7.parcel.sha
4. Restart the CM server, the CM agents, and the cluster
[root@hadoop001 csd]# systemctl restart cloudera-scm-server
[root@hadoop001 csd]# systemctl restart cloudera-scm-agent
5. Add the local path of the parcel repository
6. Add the service
7. Choose the dependencies for Spark2
8. Assign the roles
9. Installation complete
There is one problem, though: the status shown for the Spark Gateway role type is "Not applicable".
According to what I found online, configuring just one History Server is enough, and a "not applicable" Gateway does not affect using Spark =_=, but the missing green check mark in front of Spark is still annoying..
At first I had uploaded SPARK2_ON_YARN-2.1.0.cloudera1.jar only to /opt/cloudera/csd on hadoop001, so I suspected the cause was the missing jar on the worker nodes. I uploaded the jar to hadoop002 and hadoop003 as well, fixed its permissions, owner, and group, and reinstalled Spark2. After the installation finished and I restarted the CM service and the cluster, the problem was still there.
I am leaving it at that for now; if any reader knows how to solve it, please let me know.
Start spark-shell to test whether Spark launches correctly:
[root@hadoop001 ~]# cd /opt/cloudera/parcels/SPARK2/bin
[root@hadoop001 bin]# ll
total 12
-rwxr-xr-x 1 root root 692 Mar 29 2017 pyspark2
-rwxr-xr-x 1 root root 653 Mar 29 2017 spark2-shell
-rwxr-xr-x 1 root root 654 Mar 29 2017 spark2-submit
[root@hadoop001 bin]# ./spark2-shell --master local[2]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://192.168.137.2:4040
Spark context available as 'sc' (master = local[2], app id = local-1552192309023).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.1.0.cloudera1
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
scala>