HDP 3.1.5 ships with Spark 2.4 by default; the goal here is to upgrade to Spark 3.
1. Download Spark 3 from the official Spark site and pick the build prebuilt for Hadoop 3.2 (spark3-hadoop3.2). It does not matter that this Hadoop version is newer than the one in HDP 3.1.5; without the Hadoop 3.2 build, spark-sql will not work.
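For example (a sketch; 3.0.1 is only an illustrative release, any Spark 3.x build for Hadoop 3.2 will do):

# download and unpack a Spark 3 build prebuilt for Hadoop 3.2
wget https://archive.apache.org/dist/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz
tar -xzf spark-3.0.1-bin-hadoop3.2.tgz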
2. Put the downloaded spark3 directory at /usr/hdp/3.1.5-xx/spark3 (any location works).
3. Copy /etc/spark2/conf/spark-defaults.conf, /etc/spark2/conf/spark-env.sh, and hive-site.xml into spark3/conf, as sketched below.
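A sketch of steps 2-3, using the concrete HDP version 3.1.5.0-152 that appears later in this note, and assuming hive-site.xml also sits in /etc/spark2/conf (adjust the source path if yours lives elsewhere, e.g. /etc/hive/conf):

mv spark-3.0.1-bin-hadoop3.2 /usr/hdp/3.1.5.0-152/spark3
cp /etc/spark2/conf/spark-defaults.conf /usr/hdp/3.1.5.0-152/spark3/conf/
cp /etc/spark2/conf/spark-env.sh        /usr/hdp/3.1.5.0-152/spark3/conf/
cp /etc/spark2/conf/hive-site.xml       /usr/hdp/3.1.5.0-152/spark3/conf/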
4. Add the following parameters to spark-defaults.conf:
spark.driver.extraJavaOptions -Dhdp.version=3.1.5.0-152
spark.yarn.am.extraJavaOptions -Dhdp.version=3.1.5.0-152
spark.yarn.archive hdfs:///hdp/apps/spark3/spark3.zip
5. Go to /usr/hdp/3.1.5-xx/spark3/jars and run:
zip spark3.zip ./*
hdfs dfs -put ./spark3.zip hdfs:///hdp/apps/spark3/
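Note that hdfs dfs -put fails if the target directory does not exist yet; create it first and then verify the upload (same paths as above):

hdfs dfs -mkdir -p hdfs:///hdp/apps/spark3/
hdfs dfs -ls hdfs:///hdp/apps/spark3/spark3.zip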
6. In spark-env.sh under spark3/conf, add:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
The SPARK_CONF_DIR line carried over from the Spark 2 copy must also be corrected so that it points at the new conf directory:
export SPARK_CONF_DIR=${SPARK_CONF_DIR:-/usr/hdp/3.1.5.0-152/spark3/conf/}
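Putting step 6 together, the relevant part of spark3/conf/spark-env.sh should end up reading as follows (SPARK_DIST_CLASSPATH appends the output of hadoop classpath, so Spark 3 sees the cluster's HDP Hadoop jars and config):

# make the cluster's Hadoop jars visible to Spark 3
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# must point at the Spark 3 conf dir, not the Spark 2 one copied over
export SPARK_CONF_DIR=${SPARK_CONF_DIR:-/usr/hdp/3.1.5.0-152/spark3/conf/}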
7. Test:
/usr/hdp/3.1.5-xx/spark3/bin/spark-sql
/usr/hdp/3.1.5-xx/spark3/bin/spark-submit
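A quick smoke test on YARN using the bundled SparkPi example (a sketch; the examples jar name depends on the exact Spark 3 version you downloaded):

/usr/hdp/3.1.5.0-152/spark3/bin/spark-submit \
  --master yarn \
  --deploy-mode client \
  --class org.apache.spark.examples.SparkPi \
  /usr/hdp/3.1.5.0-152/spark3/examples/jars/spark-examples_2.12-3.0.1.jar 100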
Note: with this kind of side-by-side multi-version setup, the steps above are done on the gateway (client) machine.
Run bin/spark-submit from the spark3 directory and you get Spark 3;
run bin/spark-submit from /usr/hdp/3.1.5-xx/spark2 and you get Spark 2.
The two do not interfere with each other, and the worker nodes need no changes.
After the upgrade, submitting a job with Spark 3 in cluster mode fails with an error like the following:
15/09/01 21:54:05 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:05 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1441144443866
final status: UNDEFINED
tracking URL: http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/proxy/application_1441066518301_0013/
user: stack
15/09/01 21:54:06 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:10 INFO yarn.Client: Application report for application_1441066518301_0013 (state: FAILED)
15/09/01 21:54:10 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1441066518301_0013 failed 2 times due to AM Container for appattempt_1441066518301_0013_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/cluster/app/application_1441066518301_0013Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e03_1441066518301_0013_02_000001
Exit code: 1
Exception message: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Fix: create a file named java-opts under spark3/conf containing -Dhdp.version=xxx.
The version string can be found with hdp-select status hadoop-client.
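For example (a sketch; substitute the version string hdp-select reports on your cluster, e.g. 3.1.5.0-152 as used above):

echo "-Dhdp.version=3.1.5.0-152" > /usr/hdp/3.1.5.0-152/spark3/conf/java-opts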