Background
As the business has grown, the current machine can no longer meet the workload, so Cloudera Manager Server needs to be migrated to another host.
Environment
- Operating system: CentOS 7
- JDK: 1.7
- CDH version: 5.8.4
Procedure (embedded PostgreSQL version)
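1. Check the environment first
- hosts file
- JDK version
- whether the host operating system is supported by this release
- whether the hostname follows the naming rules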
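The hostname and hosts-file checks above could be scripted roughly as follows. This is a minimal sketch: the function names `is_valid_fqdn` and `hosts_file_has` are hypothetical, and treating "follows the naming rules" as "looks like a fully qualified domain name" is an assumption, not something the steps above specify.

```shell
#!/usr/bin/env bash
# Sketch of two pre-migration checks (function names are hypothetical).

# Succeeds if the name looks like an FQDN: at least two dot-separated labels
# made of letters, digits and hyphens, with no leading or trailing hyphen.
is_valid_fqdn() {
    local name="$1"
    local label='[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?'
    local re="^${label}(\.${label})+\$"
    [[ "$name" =~ $re ]]
}

# Succeeds if the hosts file (default /etc/hosts) maps some address to the name.
# Anything after a '#' on a line is treated as a comment and ignored.
hosts_file_has() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

For example, `is_valid_fqdn "$(hostname -f)" && hosts_file_has "$(hostname -f)"` could be run on the new host before installing anything.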
2. Install a new Cloudera Manager Server
# Download the Cloudera Manager Server installer from the official site
$ wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
# Make it executable
$ chmod u+x cloudera-manager-installer.bin
# Run the installer (accept yes/next at every prompt and wait for it to finish)
$ sudo ./cloudera-manager-installer.bin
3. Export the database on the old host and keep the dump as a backup
$ pg_dump -h localhost -p 7432 -U scm > scm_server_db_backup.$(date +%Y%m%d)
# The scm password is stored in /etc/cloudera-scm-server/db.properties
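To avoid typing the scm password by hand, it can be read from db.properties and passed to pg_dump through the standard libpq variable PGPASSWORD. The sketch below assumes the password is stored under the key `com.cloudera.cmf.db.password`; the function name `scm_db_password` is hypothetical, and reading the file may require root.

```shell
#!/usr/bin/env bash
# Print the value of com.cloudera.cmf.db.password from a db.properties file.
# The default path is the one named in the step above.
scm_db_password() {
    local file="${1:-/etc/cloudera-scm-server/db.properties}"
    sed -n 's/^com\.cloudera\.cmf\.db\.password=//p' "$file"
}

# Usage (not executed here):
#   PGPASSWORD="$(scm_db_password)" \
#       pg_dump -h localhost -p 7432 -U scm > "scm_server_db_backup.$(date +%Y%m%d)"
```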
4. Stop the agent process on the host being migrated
$ sudo service cloudera-scm-agent stop
5. Copy the metadata to the new CMS host
# If the copy hits file conflicts, first delete the existing files in that directory on the new host
$ scp -r /var/lib/cloudera-scm-server/* CMSHOST:/var/lib/cloudera-scm-server/
6. Initialize and start the database on the new Cloudera Manager host
$ sudo service cloudera-scm-server-db start
7. Import the backup into the new Cloudera Manager database
$ psql -h localhost -p 7432 -U scm < scm_server_db_backup.20170424
# Notes:
# scm password: see /etc/cloudera-scm-server/db.properties
# PostgreSQL superuser (cloudera-scm) password: see /var/lib/cloudera-scm-server-db/data/generated_password.txt [psql -h localhost -p 7432 -U cloudera-scm -d postgres]
8. Start the Cloudera Manager Server service
$ sudo service cloudera-scm-server restart
# Check the log for errors
$ tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
# Log in to the CMS web UI
# Username: admin  Password: admin
http://localhost:7180/
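Instead of watching `tail -f`, the log check in this step can be reduced to a count of error lines. This is a rough sketch that assumes the server log puts the level (ERROR or FATAL) as a space-delimited word on each line; `count_log_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print how many lines of the server log carry an ERROR or FATAL level.
# Prints 0 (and still returns success) when there are none.
count_log_errors() {
    local log="${1:-/var/log/cloudera-scm-server/cloudera-scm-server.log}"
    grep -c -E ' (ERROR|FATAL) ' "$log" || true
}
```

A non-zero count is only a prompt to read the log; some errors during startup may be transient.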
9. On every agent node, edit the configuration file to point at the new server, then start the agent process
# Change the server address used by the agent
$ sudo vi /etc/cloudera-scm-agent/config.ini
# Hostname of the CM server.
server_host=
# Port that the CM server is listening on.
server_port=7182
# Start the agent process
$ sudo service cloudera-scm-agent start
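On a cluster with many agents, editing config.ini by hand in vi is slow. The edit could be done non-interactively with sed, as in this sketch; `set_server_host` is a hypothetical helper, and the host name passed to it is whatever the new Cloudera Manager Server is called in your environment.

```shell
#!/usr/bin/env bash
# Rewrite the server_host= line of an agent config file in place,
# keeping a copy of the previous file with a .bak suffix.
# The new host must not contain '/' or '&' (plain host names and IPs do not).
set_server_host() {
    local new_host="$1"
    local file="${2:-/etc/cloudera-scm-agent/config.ini}"
    sed -i.bak "s/^server_host=.*/server_host=${new_host}/" "$file"
}

# Usage (not executed here), run as root on each agent node:
#   set_server_host NEW_CM_HOST && service cloudera-scm-agent start
```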
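10. Log in to the CM web UI, delete the Cloudera Management Service and its roles, and add the service again.
11. Check that the hosts whose agents were migrated report their health normally.
MySQL version (covers only the database metadata migration; the other steps are the same as above)
1. Back up the MySQL database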
$ sudo mysqldump -u cm -p --databases cm > cm_backup.sql
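Before relying on the backup, it is worth confirming that the dump was not cut short. With default options mysqldump ends its output with a `-- Dump completed` comment line, so a simple check is possible; the sketch below assumes comments were not disabled (for example with `--skip-comments`), and `dump_is_complete` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Succeeds if the last line of the dump file is mysqldump's completion marker.
dump_is_complete() {
    local dump="$1"
    tail -n 1 "$dump" | grep -q '^-- Dump completed'
}

# Usage (not executed here):
#   dump_is_complete cm_backup.sql || echo "cm_backup.sql looks truncated" >&2
```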
2. Install MySQL (MariaDB) on CentOS
$ sudo yum install mariadb-server mariadb
3. Tune the MySQL parameters for CM (values taken from the official documentation)
$ sudo vi /etc/my.cnf
transaction-isolation = READ-COMMITTED
key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
log_bin = /var/lib/mysql/mysql_binary_log
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
4. Start MySQL
# Start the MySQL service
$ sudo service mariadb start
# Run the initial secure setup
$ sudo /usr/bin/mysql_secure_installation
5. Initialize the CM database configuration
$ sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm cm_password
$ sudo java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db
create user 'cm'@'localhost' IDENTIFIED BY 'cm_password';
6. Create the CM metadata database and import the backup
$ mysql -h localhost -u cm -p -e "create database cm"
$ mysql -h localhost -u cm -p -e "grant all privileges on cm.* to 'cm'@'localhost'"
$ mysql -h localhost -u cm -p < cm_backup.sql
Note: migrating this way loses the historical monitoring data.
Common errors
- Error reported by the Cloudera Manager agent:
ERROR Error, CM server guid updated, expected 50151af7-5eb3-4c4d-8ebe-02162986ade8, received d8f6ac55-a8d7-4c8f-a6d2-747f6e6c4f48
# Prerequisite: the agent version matches the CM version
# (1)
sudo service cloudera-scm-agent stop
# (2)
sudo rm /var/lib/cloudera-scm-agent/*
# (3)
# Set server_host= in /etc/cloudera-scm-agent/config.ini to the new CM server host
# (4)
sudo service cloudera-scm-agent start
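To find out which agents actually need the fix above, the agent log can be searched for the message quoted in the error. This is a small sketch; `has_guid_mismatch` is a hypothetical helper, and the default log path is the agent log location used elsewhere in this document.

```shell
#!/usr/bin/env bash
# Succeeds if the agent log contains the "CM server guid updated" error.
has_guid_mismatch() {
    local log="${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}"
    grep -q 'CM server guid updated' "$log"
}

# Usage (not executed here):
#   has_guid_mismatch && echo "this agent still remembers the old CM server"
```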
- When expanding the cluster
/var/log/cloudera-scm-agent/cloudera-scm-agent.log
shows this error: https ERROR Failed to retrieve/stroe URL: http://nfjd-hadoop02-node46.jpushoa.com:7180/cmf/parcel/download/CDH-5.7.3-1.cdh5.7.3.p0.5-el7.parcel.torrent -> /opt/cloudera/parcel-cache/CDH-5.7.3-1.cdh5.7.3.p0.5-el7.parcel.torrent HTTP Error 404: Not Found
# Prerequisite: all files on the original CM host are still available
# Fix: copy everything under /opt/cloudera/parcel-repo on the original CM host to the same directory on the new CM host, then change the owner to cloudera-scm
# (1)
scp -r /opt/cloudera/parcel-repo NEWCMS:/opt/cloudera/
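# (2)
sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
# If the files on the original host have already been deleted, try downloading the parcels used previously from the official archive into this directory.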
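After copying or re-downloading parcels, each one can be checked against its checksum file before Cloudera Manager is asked to distribute it. The sketch below assumes every `X.parcel` in the repository sits next to an `X.parcel.sha` file holding its SHA-1 hash in hex; `verify_parcel` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeeds if the SHA-1 of the parcel equals the hash stored in <parcel>.sha.
verify_parcel() {
    local parcel="$1"
    local expected actual
    expected="$(tr -d '[:space:]' < "${parcel}.sha")"
    actual="$(sha1sum "$parcel" | awk '{print $1}')"
    [ "$expected" = "$actual" ]
}

# Usage (not executed here):
#   for p in /opt/cloudera/parcel-repo/*.parcel; do
#       verify_parcel "$p" || echo "checksum mismatch: $p" >&2
#   done
```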
Official documentation
https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ag_restore_server.html