Ambari Issue Highlights

Updated from time to time.

A collection of odd problems encountered with Ambari.


Hive MetaStore fails to start after installing Ambari

File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 293, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'export HIVE_CONF_DIR=/usr/hdp/current/hive-metastore/conf/conf.server ; /usr/hdp/current/hive-metastore/bin/schematool -initSchema -dbType mysql -userName hive -passWord [PROTECTED]' returned 1.
WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL:     jdbc:mysql://c6405.ambari.apache.org/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :     com.mysql.jdbc.Driver
Metastore connection User:     hive
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
*** schemaTool failed ***

Solution:
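The MySQL password configured in Hive does not match the password set for the hive user in MySQL. Change either the MySQL password or the Hive configuration so that the two match.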

Spark 2.0 on YARN

1. Jersey NoClassDefFoundError

bin/spark-sql --driver-memory 10g --verbose --master yarn --packages com.databricks:spark-csv_2.10:1.3.0 --executor-memory 4g --num-executors 20 --executor-cores 2
16/05/09 13:15:21 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/05/09 13:15:21 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041
16/05/09 13:15:21 INFO util.Utils: Successfully started service 'SparkUI' on port 4041.
16/05/09 13:15:21 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://bigaperf116.svl.ibm.com:4041
Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:45)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:163)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
no issues
http://apache-spark-developers-list.1001551.n3.nabble.com/spark-2-0-issue-with-yarn-td17440.html

A temporary solution:
Set yarn.timeline-service.enabled to false to turn off ATS (the YARN Application Timeline Server).
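A minimal sketch of applying this setting per job instead of cluster-wide. It relies on Spark copying spark.hadoop.* properties into the Hadoop configuration; the helper name spark_sql_command is hypothetical, and the flags are taken from the command shown earlier.

```python
import shlex


def spark_sql_command(extra_args=()):
    """Return an argv list for spark-sql with the YARN timeline client disabled.

    Hypothetical helper: only the --conf entry is the actual workaround.
    """
    return [
        "bin/spark-sql",
        "--master", "yarn",
        # spark.hadoop.* entries are passed through to the Hadoop Configuration,
        # so this has the same effect as yarn.timeline-service.enabled=false
        # for this one application.
        "--conf", "spark.hadoop.yarn.timeline-service.enabled=false",
        *extra_args,
    ]


if __name__ == "__main__":
    cmd = spark_sql_command(["--driver-memory", "10g", "--executor-memory", "4g"])
    print(shlex.join(cmd))
```

Changing the property in yarn-site through Ambari affects every application; the per-job form leaves ATS running for everything else.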

2. bad substitution

diagnostics: Application application_1441066518301_0013 failed 2 times due to AM Container for appattempt_1441066518301_0013_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://localhost:8088/cluster/app/application_1441066518301_0013Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e03_1441066518301_0013_02_000001
Exit code: 1
Exception message: /mnt/yarn/nm/local/usercache/stack/appcache/
application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/
launch_container.sh: line 24:$PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:
/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:
/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Solution:
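This usually happens when components were installed by hand, so the variable was never substituted.
In the MapReduce2 configuration, edit mapreduce.application.classpath and replace ${hdp.version} with the version part of the absolute HDP path, e.g. 2.4.0.0-169.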
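The shell reports "bad substitution" because ${hdp.version} is not a valid shell variable name (it contains a dot), so launch_container.sh cannot expand it. A small sketch of the replacement described above; the function names are hypothetical and the version string is the example from the text.

```python
import re

PLACEHOLDER = "${hdp.version}"


def pin_hdp_version(classpath, version):
    """Replace every ${hdp.version} placeholder with a concrete version."""
    return classpath.replace(PLACEHOLDER, version)


def unresolved_placeholders(classpath):
    """List any ${...} placeholders still present after substitution."""
    return re.findall(r"\$\{[^}]+\}", classpath)


if __name__ == "__main__":
    entry = "/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar"
    fixed = pin_hdp_version(entry, "2.4.0.0-169")
    print(fixed)
    print(unresolved_placeholders(fixed))
```

Running unresolved_placeholders on the edited value is a quick way to confirm nothing was missed before saving the configuration.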

Service startup fails on ulimit -c unlimited

resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'' returned 1. -bash: line 0: ulimit: core file size: cannot modify limit: Operation not permitted
starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-wy1.jcloud.local.out

Solution:
On CentOS 7.1, when the HDFS NameNode or DataNode is started by a non-root user, the start script runs ulimit -c unlimited. Ambari uses su to switch to the hdfs account for the start, and that account is not permitted to run this command, so the NameNode or DataNode fails to start. One workaround is to change the Ambari code so that HDFS startup no longer runs ulimit -c unlimited.
The code to change:
Edit the file:

/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py

In this line:

cmd = format("{ulimit_cmd} {hadoop_daemon} --config {hadoop_conf_dir} {action} {name}")

delete {ulimit_cmd}, then restart ambari-agent.
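To see why the hdfs account is refused, it can help to inspect the core-file limits it actually has. A rough sketch using Python's standard resource module (Unix only): an unprivileged process can raise a limit only up to its hard limit, so ulimit -c unlimited fails whenever the hard limit is finite. The function names are hypothetical, and treating "root" as the only privileged case is a simplification.

```python
import os
import resource


def can_raise_core_to_unlimited(hard_limit, is_root=False):
    """True if 'ulimit -c unlimited' would be permitted.

    Simplified rule: root may raise the hard limit; anyone else needs the
    hard limit to already be unlimited.
    """
    return is_root or hard_limit == resource.RLIM_INFINITY


def describe_core_limit():
    """Report the current process's core-file limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    allowed = can_raise_core_to_unlimited(hard, is_root=(os.geteuid() == 0))
    return {"soft": soft, "hard": hard, "unlimited_allowed": allowed}


if __name__ == "__main__":
    # Run this as the hdfs user to see the limits that account inherits.
    print(describe_core_limit())
```

If the hard limit turns out to be finite, raising it for the hdfs user at the operating-system level is an alternative to editing the Ambari script.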

Host registration fails

ERROR 2016-08-01 13:33:38,932 main.py:309 - Fatal exception occurred:
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 306, in <module>
main(heartbeat_stop_callback)
File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 242, in main
stop_agent()
File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 189, in stop_agent
sys.exit(1)
SystemExit: 1

Solution:
This happens because Ambari uses ASCII encoding by default. If your operating system uses a Chinese locale, add the following at the top of /usr/lib/python2.6/site-packages/ambari_agent/main.py:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

Then retry the failed step and it should succeed.

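How to delete an existing Ambari service

After defining a custom service named SAMPLE, there turned out to be no way to delete it from the web UI on port 8080.
Solution:

  1. Stop the service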
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo": {"context":"Stop Service"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' http://localhost:8080/api/v1/clusters/hadoop/services/SAMPLE

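Because the SAMPLE service does not actually do anything, it may start itself again shortly afterwards, so be quick.

  2. Delete the service (run this immediately)
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://localhost:8080/api/v1/clusters/hadoop/services/SAMPLE

If the service has not been stopped, you will see: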

{
"status" : 500,
"message" : "org.apache.ambari.server.controller.spi.SystemException: An internal system exception occurred: Cannot remove hadoop/SAMPLE. MYMASTER is in a non-removable state."
}

That is fine; just run the command again.

  3. Verify
    Reload the web UI on port 8080; the SAMPLE service is gone.
  4. A few more examples:
    Remove a host component from a host:
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X DELETE 'localhost:8080/api/v1/clusters/blueCluster/hosts/elk2.jcloud.local/host_components/FLUME_HANDLER'
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X DELETE 'localhost:8080/api/v1/clusters/cluster/hosts/ochadoop10/host_components/NAMENODE'
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X DELETE 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbom-if-58/host_components/YARN_CLIENT'

Install a component:

curl -u admin:admin -i -H "X-Requested-By:ambari" -X POST 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbbdc-dn-09/host_components/PHOENIX_QUERY_SERVER'
curl -u admin:admin -i -H "X-Requested-By:ambari" -X PUT 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbbdc-dn-09/host_components/PHOENIX_QUERY_SERVER' -d '{"HostRoles": {"state": "INSTALLED"}}'
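The stop-then-delete calls above can also be scripted. Below is a minimal sketch with Python's standard urllib that builds the same two requests as the curl commands; the helper names are hypothetical, and the base URL, admin:admin credentials, cluster name hadoop and service name SAMPLE simply mirror the examples and need adjusting for a real cluster.

```python
import base64
import json
import urllib.request

AMBARI = "http://localhost:8080/api/v1"  # same base URL as the curl calls


def build_request(method, path, user="admin", password="admin", body=None):
    """Build (but do not send) an authenticated Ambari REST request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(AMBARI + path, data=data, method=method)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", "Basic " + token)
    # Same header the curl examples send with -H "X-Requested-By: ambari".
    req.add_header("X-Requested-By", "ambari")
    return req


def stop_service_request(cluster, service):
    """PUT the service into the INSTALLED state, i.e. stop it."""
    body = {
        "RequestInfo": {"context": "Stop Service"},
        "Body": {"ServiceInfo": {"state": "INSTALLED"}},
    }
    return build_request("PUT", f"/clusters/{cluster}/services/{service}", body=body)


def delete_service_request(cluster, service):
    """DELETE the service definition from the cluster."""
    return build_request("DELETE", f"/clusters/{cluster}/services/{service}")
```

To send them, pass each request to urllib.request.urlopen, for example urllib.request.urlopen(stop_service_request("hadoop", "SAMPLE")) followed by the delete request. An HTTP 500 on the delete raises urllib.error.HTTPError, which is the point at which to repeat the delete as described above.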

How to reset the Ambari administrator password

To log in as the Ambari admin user again, reset the admin password as follows:

  1. Stop Ambari server
  2. Log on to ambari server host shell
  3. Run 'psql -U ambari ambari'
  4. Enter password **** (this is the password Ambari uses to connect to its database; the default is bigdata, and it is stored in plain text in /etc/ambari-server/conf/password.dat)
  5. In psql:
    update ambari.users set
    user_password='538916f8943ec225d97a9a86a2c6ec0818c1cd400e09e03b660fdaaec4af29ddbb6f2b1033b81b00'
    where user_name='admin';
  6. Quit psql: ctrl+D
  7. Run 'ambari-server restart'

User [dr.who] is not authorized to view the logs for application

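After enabling access control on the Hadoop cluster, the job log UI became inaccessible: User [dr.who] is not authorized to view the logs for application
Reason:
dr.who, the default user of the Resource Manager UI, does not have the required permissions.
Solution:
If the cluster is managed by Ambari, go to HDFS > Configurations > Custom core-site > Add Property and add:
hadoop.http.staticuser.user=yarn
Changing configuration from a script on the server:
Get the configuration: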

/var/lib/ambari-server/resources/scripts/configs.sh get localhost hdp_cluster  hive-site|grep hive.server2.authenticatio
"hive.server2.authentication" : "NONE",
"hive.server2.authentication.spnego.keytab" : "HTTP/_HOST@EXAMPLE.COM",
"hive.server2.authentication.spnego.principal" : "/etc/security/keytabs/spnego.service.keytab",

Change the configuration:

/var/lib/ambari-server/resources/scripts/configs.sh set localhost hdp_cluster  hive-site hive.server2.authentication LDAP
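When these calls are made often, building the argument list in one place avoids ordering mistakes. A small sketch that only assembles the command in the forms used in this document (get, set with a key and value, or set with a file); the function name configs_cmd is hypothetical and nothing is executed here.

```python
CONFIGS_SH = "/var/lib/ambari-server/resources/scripts/configs.sh"


def configs_cmd(action, host, cluster, config_type, *rest, user=None, password=None):
    """Assemble a configs.sh argv list.

    rest is empty for 'get', and either (key, value) or (filename,) for 'set',
    matching the invocations shown in this document.
    """
    if action not in ("get", "set"):
        raise ValueError("only 'get' and 'set' are covered by this sketch")
    argv = [CONFIGS_SH]
    if user is not None and password is not None:
        argv += ["-u", user, "-p", password]
    argv += [action, host, cluster, config_type, *rest]
    return argv


if __name__ == "__main__":
    print(configs_cmd("get", "localhost", "hdp_cluster", "hive-site"))
    print(configs_cmd("set", "localhost", "hdp_cluster", "hive-site",
                      "hive.server2.authentication", "LDAP"))
```

The resulting list can be handed to subprocess.run(argv, check=True) on the Ambari server host.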

ambari-sudo.sh /usr/bin/hdp-select error

ambari-sudo.sh /usr/bin/hdp-select set all `ambari-python-wrap /usr/bin/hdp-select versions | grep ^2.4.0.0-169 | tail -1`'] {'only_if': 'ls -d /usr/hdp/2.4.0.0-169*

Solution:

  1. What happens when you run "hdp-select versions" from the command line, as root? Does it return your current 2.4 version number? If not, inspect your /usr/hdp and make sure you have only "current" and the directories named after your versions (2.4 and older ones if you did an upgrade) there. If you have any other file there, delete it, and retry, first "hdp-select versions" and then ATS.
  2. Go to /usr/bin/ and edit hdp-select (vi hdp-select). In printVersions(), add "hadoop" to the list of ignored entries:
    def printVersions():
......
......
 -    if f not in [".", "..", "current", "share", "lost+found"]:
 +    if f not in [".", "..", "current", "share", "lost+found","hadoop"]:
......
  3. If symlinks conflict, delete the extra symlinks and retry.
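The first check above (nothing in /usr/hdp except "current" and version directories) can be sketched as a short script. The version pattern is an assumption based on names such as 2.4.0.0-169, the ignore set is copied from the hdp-select snippet, and stray_entries is a hypothetical name.

```python
import os
import re

# Assumed shape of an HDP version directory name, e.g. 2.4.0.0-169.
VERSION_RE = re.compile(r"^\d+(\.\d+){3}-\d+$")
# Entries that hdp-select already ignores in the snippet above.
IGNORED = {"current", "share", "lost+found"}


def stray_entries(root="/usr/hdp"):
    """Return names under root that are neither ignored nor version-like."""
    return sorted(
        name for name in os.listdir(root)
        if name not in IGNORED and not VERSION_RE.match(name)
    )


if __name__ == "__main__":
    for name in stray_entries():
        print("unexpected entry in /usr/hdp:", name)
```

Anything this prints is a candidate for removal (step 1) or for the ignore list in hdp-select (step 2).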

HiveMetaStore or Hiveserver fails to come up

Support KB
SYMPTOM
HiveServer2 fails to come up and an error similar to the following is reported in the hiveserver2.log file:

2015-11-18 20:47:19,965 WARN  [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(442)) - Error starting HiveServer2 on attempt 4, will retry in 60 seconds
org.apache.hive.service.ServiceException: Failed to Start HiveServer2
   at org.apache.hive.service.CompositeService.start(CompositeService.java:80)
   at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:366)
   at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:412)
   at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:78)
   at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:654)
   at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:527)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:497)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hive.service.ServiceException: Unable to connect to MetaStore!
   at org.apache.hive.service.cli.CLIService.start(CLIService.java:154)
   at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
   ... 11 more
Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException javax.jdo.JDOException: Exception thrown when executing query        
   at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)        
   at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:230)        
   at org.apache.hadoop.hive.metastore.ObjectStore.getDatabases(ObjectStore.java:701)        
   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)        
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)        
   at java.lang.reflect.Method.invoke(Method.java:497)        
   at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)        
   at com.sun.proxy.$Proxy7.getDatabases(Unknown Source)        
   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_databases(HiveMetaStore.java:1158)        
   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)        
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

HiveMetaStore fails to come up

2017-02-27 14:45:05,361 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5908)) - Starting hive metastore on port 9083
2017-02-27 14:45:05,472 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(590)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2017-02-27 14:45:05,497 INFO  [main]: metastore.ObjectStore (ObjectStore.java:initialize(294)) - ObjectStore, initialize called
2017-02-27 14:45:06,193 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - Error : An error occurred trying to instantiate an instance of the adapter "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" for this JDBC driver : Class "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" was not found in the CLASSPATH. Please check your specification and your CLASSPATH.
Class "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" was not found in the CLASSPATH. Please check your specification and your CLASSPATH.
org.datanucleus.exceptions.ClassNotResolvedException: Class "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" was not found in the CLASSPATH. Please check your specification and your CLASSPATH.
   at org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:216)
   at org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:368)
   at org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:391)
   at org.datanucleus.store.rdbms.adapter.DatastoreAdapterFactory.getAdapterClass(DatastoreAdapterFactory.java:226)
   at org.datanucleus.store.rdbms.adapter.DatastoreAdapterFactory.getNewDatastoreAdapter(DatastoreAdapterFactory.java:144)
   at org.datanucleus.store.rdbms.adapter.DatastoreAdapterFactory.getDatastoreAdapter(DatastoreAdapterFactory.java:92)
   at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:309)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConst

ROOT CAUSE
AMBARI-12947, BUG-44352
After Ambari 2.1 and up to Ambari 2.1.2, it is mandatory to initialize datanucleus.rdbms.datastoreAdapterClassName in the Hive configs. The parameter is
needed only when a SQL Anywhere database is used, yet Ambari offers no option to delete it.
RESOLUTION
Upgrade to Ambari 2.1.2.
WORKAROUND
Remove the Hive configuration parameter 'datanucleus.rdbms.datastoreAdapterClassName' from hive-site using configs.sh.
For example:

  1. Dump the hive-site parameters to a file
    /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin get Ambari_Hostname Ambari_ClusterName hive-site > /tmp/hive-site.txt
    This redirects all Ambari Hive config parameters to /tmp/hive-site.txt
  2. Edit the /tmp/hive-site.txt file created above and remove 'datanucleus.rdbms.datastoreAdapterClassName'. Also remove the
    lines before the 'properties' tag
  3. Set the hive-site parameters using /tmp/hive-site.txt
    /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin set Ambari_Hostname Ambari_ClusterName hive-site /tmp/hive-site.txt
  4. Start Hive Services

This article was created by Hortonworks Support (Article: 000003468) on 2015-11-25 06:07
OS: Linux
Type: Cluster_Administration
Version: 2.1.0, 2.3.0
Support ID: 000003468
https://issues.apache.org/jira/browse/AMBARI-13114
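Step 2 of the workaround (editing the dumped file) can be sketched as a line filter. This assumes the dump is line-oriented with one "key" : "value" pair per line and a line containing "properties" marking where the settings begin; the exact dump format is not shown in this document, so check the result by hand. The function name strip_property is hypothetical.

```python
KEY = "datanucleus.rdbms.datastoreAdapterClassName"


def strip_property(lines, key=KEY):
    """Drop everything before the 'properties' line, then drop the key's line.

    If no 'properties' line is found, all lines are kept and only the
    key's line is removed.
    """
    start = next((i for i, line in enumerate(lines) if '"properties"' in line), 0)
    return [line for line in lines[start:] if f'"{key}"' not in line]


if __name__ == "__main__":
    path = "/tmp/hive-site.txt"  # file produced by the 'get' command in step 1
    with open(path) as fh:
        kept = strip_property(fh.read().splitlines())
    with open(path, "w") as fh:
        fh.write("\n".join(kept) + "\n")
```

If the removed entry was the last one in the block, the line before it may be left with a trailing comma that needs deleting before running the set command.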